Basecamp Export

Morgan Maguire, CEO

Hello all,

Further to my message above,

Rob

and his team are going to get involved in building our front-end searches for ISLG and ILG. They will begin development after the UI features and the SQL databases are complete in October. At that time, Contegra will be able to thoroughly review the new database structures and understand the scope of work needed to complete the necessary tasks.

At this point, we anticipate Contegra performing the following:

Custom Indexer Service: Contegra will create dtSearch indexes containing both the full text and the relevant SQL fielded information, which will include:

a command line interface to create a new index from scratch; and
the ability to check the file system directories for new or changed documents and process updates.
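
To illustrate the second item, here is a minimal sketch of how an indexer could detect new or changed documents between runs. It is written in Python purely for illustration and does not call dtSearch. The function names `snapshot` and `find_updates`, and the choice of comparing modification times, are assumptions rather than Contegra's actual design.

```python
import os
import tempfile


def snapshot(root):
    """Map each file path under root to its last-modified time in nanoseconds."""
    state = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            state[path] = os.stat(path).st_mtime_ns
    return state


def find_updates(previous, current):
    """Return (new, changed, removed) paths between two snapshots."""
    new = sorted(p for p in current if p not in previous)
    changed = sorted(
        p for p in current if p in previous and current[p] != previous[p]
    )
    removed = sorted(p for p in previous if p not in current)
    return new, changed, removed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        first = os.path.join(root, "a.txt")
        with open(first, "w") as handle:
            handle.write("v1")
        before = snapshot(root)
        os.utime(first, ns=(1, 1))  # force a different modification time
        with open(os.path.join(root, "b.txt"), "w") as handle:
            handle.write("new")
        print(find_updates(before, snapshot(root)))
```

A real indexer would persist the previous snapshot between runs and pass only the new and changed paths to the index update step.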

Web Service: Contegra will create a web service, which will include:

creating a web service that will handle all search requests from the client applications (ISLG and ILG);
allowing the client to communicate with the web service using a REST-like interface, returning data in JSON format; and
creating a .NET-based web service hosted on IIS.

After the above services are complete, Contegra will then assist us in:

setting up requirements for periodic merging of updates and optimizing indexes;
setting up index backup requirements; and
creating protocols for error handling.

Please let me know if you have any questions about the above.

In the interim,

Jitesh

please review and let us know if you foresee any problems with integrating these plans into your existing development timelines.

Thanks,

Morgan

Sep 18, 2019 at 4:44 PM Notified 9 people

Morgan Maguire, CEO

Hi

Jitesh

,

Following up on my note above, could you please confirm whether you foresee any problems with integrating these plans into your existing development timelines.

Thanks,

Morgan

Sep 23, 2019 at 9:18 PM Notified 9 people

Jitesh Dhuravala

Hi

Rob

and

Radomir

,

We are planning to start the dtSearch implementation in ILG, as we need dtSearch indexes for the subscriber-side search functionality. Before we start, it would be good to have a call so that we can proceed in line with what we discussed in our last call.

Thanks,
Jitesh

Feb 28, 2020 at 6:51 AM Notified 10 people

Morgan Maguire, CEO

Hi

Jitesh

,

I had a call with

Rob

this afternoon, and before we schedule a call with him and

Radomir

, we should put together some documentation that gives them an opportunity to review the requirements for the subscriber-side searches, so that they can be more informed when they ask and answer questions.

Melissa

, we previously discussed getting a document that outlines all the different searches available on the subscriber side. Could you share that document here for

Rob

to review? Also, if possible, could you ensure the document includes links to the relevant user stories and wireframes, so that the requirements can be easily reviewed?

Thanks,

Morgan

Feb 29, 2020 at 12:59 AM Notified 10 people

Melissa Cowell, General Manager

Morgan

Sounds good. I can provide this by the end of the week.

Mel

Mar 03, 2020 at 4:08 PM Notified 10 people

Morgan Maguire, CEO

OK. Thanks

Melissa

.

Jitesh

and

Rob

, should we schedule the call next week then? Would Tuesday, March 10th at 8am work for you? Note that I'd like to be on the call to confirm how roles are getting delegated.

Thanks,

Morgan

Mar 03, 2020 at 4:55 PM Notified 10 people

Jitesh Dhuravala

Hi

Morgan

,

Tuesday, 10th March is a holiday for us. Please schedule the call on any other day after the 10th.

Thanks,
Jitesh

Mar 04, 2020 at 6:40 AM Notified 10 people

Morgan Maguire, CEO

Right. Of course,

Jitesh

. Does Wednesday March 11th at 7:30am Vancouver time work?

Rob

, please let me know if that works for you as well.

Thanks,

Morgan

Mar 04, 2020 at 3:46 PM Notified 10 people

Rob Wiesenberg

Hi

Morgan

,

Unfortunately, next Wednesday 3/11 we are not available at 7:30 AM (Vancouver time). We are available for a call on Tuesday, Wednesday or Friday next week, after 10 AM your time. Thursday next week is wide open, so we can have a call earlier in the day. Please let me know if one of those times works for you and Jitesh.

Thanks,
Rob

Mar 04, 2020 at 4:18 PM Notified 10 people

Morgan Maguire, CEO

Ok. Sounds good,

Rob

. Let's schedule the meeting at 8:00 on Thursday, March 12th.

I'll send a calendar invite with details.

Thanks,

Morgan

Mar 04, 2020 at 5:16 PM Notified 10 people

Rob Wiesenberg

Morgan

,

Great, please let me know if you would like to use a call-in number or Skype.

Thanks,
Rob

Mar 04, 2020 at 6:56 PM Notified 10 people

Morgan Maguire, CEO

Hi

Rob

,

We'll be connecting through Zoom. The details are in the calendar invite I sent out earlier.

Thanks,

Morgan

Mar 04, 2020 at 7:00 PM Notified 10 people

Morgan Maguire, CEO

Hi

Rob

,

In preparation for the call on Thursday,

Melissa

has put together a document that outlines the requirements for all of the searches that will be available on the subscriber side of the new ISLG application: https://docs.google.com/document/d/10TP4xS4YUgmnznIUI2pzzMA2HOZu1FPm8dE7zmgudtA/edit#heading=h.tuh7ytex0a7y.

The document is broken down into four categories of searches:

Search
Research Tools
Document Library
Other

For each search,

Melissa

has provided screenshots and links to the relevant wireframes. She has also pulled all the acceptance criteria from the relevant user stories. This should give you all the detail you'll need to understand the requirements of the various searches.

For the purposes of dtSearch, the only search that searches the document texts is the Full Text Search (and perhaps the Global Search via the Full Text Search), and thus I believe it is the only search that will require dtSearch. dtSearch is also used for the Subject Navigator in the old application, because we integrated the Boolean and linguistic options available through dtSearch. However, it appears we discarded these requirements in the new application (

Melissa

do you know if we did that purposefully or is this an oversight in the requirements?)

Let me know if you have any questions or concerns. I would be happy to hop on a call in advance of the call on Thursday to explain anything in the document, so that we can focus on how to optimize all the relevant searches.

Thanks,

Morgan

Mar 09, 2020 at 6:14 PM Notified 10 people

Jitesh Dhuravala

Hi

Morgan

,

We know how the existing dtSearch is implemented and works in the current application, but we have not yet reached the stage of the new ISLG subscriber side, for which

Melissa

has prepared the document, so we will discuss how to improve the existing dtSearch workflow and implementation. We will expect more information from

Rob

as described by

Melissa

.

Thanks,
Jitesh

Mar 11, 2020 at 12:51 PM Notified 10 people

Morgan Maguire, CEO

Ok. Sounds good,

Jitesh

. The document above was meant for

Rob

's benefit more than yours, so that he can get context for our call tomorrow. We'll discuss more tomorrow.

Melissa

, following up on my comment above concerning the Subject Navigator search, did we purposefully exclude the ability to perform Boolean searches and utilize linguistic functions (which we're currently using in the old application), because this will affect whether we use dtSearch in this search?

Thanks,

Morgan

Mar 11, 2020 at 4:48 PM Notified 10 people

Melissa Cowell, General Manager

Morgan

We went back and forth on this a couple times. The intention was to simplify/standardize the search across research tools. That being said, we did end up modifying the behaviour for the Subject Navigator and integrating boolean and linguistic tools is certainly possible.

We could incorporate the following criteria:

I can enter a keyword search that is powered by the dtSearch search engine

From the SN search input, I will be able to type in a keyword, and run a search
- Keyword search is performed by entering a term or terms and hitting [Search] to submit
- I will be able to search using three different methods
- I can select my keyword search method using a radio button:
  - I can select 'Any words':
    - Searches 'any words' typed into my search field
      - An "any words" search request consists of an unstructured natural language or "plain English" query. In a natural language search request, words such as AND and OR are disregarded. Use quotation marks to indicate a phrase, + (plus) to indicate a word that must be present, and - (minus) to indicate a word that must not be present.
  - I can select 'All words':
    - Searches 'all the words' typed into my search field
      - An "all words" search is like an "any words" search except that all of the words in the search request must be present for a document to be retrieved.
  - I can select 'Boolean' (default selection):
    - Searches keyword(s) entered into search field using 'boolean' logic
      - A boolean search request consists of a group of words, phrases or macros linked by search connectors such as AND and OR to precisely indicate the relationship between them.
- I will be able to enhance my search by selecting various linguistic aids to use
  - I will be able to select 'Stemming'
    - This will search for the word and its variants that share the same stem, e.g. like, likely, likelihood
  - I will be able to select 'Synonyms'
    - This will also search words that are synonyms of the search term, e.g. discrimination, bias, favouritism.
  - I will be able to select 'Fuzzy typo' (default selection - 1 character)
    - This will include words that differ in their spelling from the search term by the number of letters selected, e.g. favor/favour
    - I will be able to select the number of letters to include in the fuzzy typo search by choosing a value from 1-10. This option will be disabled if the above check box has not been selected.
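
The criteria above can be sketched as a small options object. This is only an illustration in Python: the class name `KeywordSearchOptions`, its field names, and the reading that the 'Fuzzy typo' box is ticked by default are assumptions, not part of the user story.

```python
from dataclasses import dataclass

SEARCH_METHODS = ("any_words", "all_words", "boolean")


@dataclass
class KeywordSearchOptions:
    """Hypothetical container for the Subject Navigator keyword search options."""

    keywords: str
    method: str = "boolean"   # 'Boolean' is the default selection
    stemming: bool = False
    synonyms: bool = False
    fuzzy: bool = True        # assumed: 'Fuzzy typo' ticked by default
    fuzzy_letters: int = 1    # default of 1 character

    def validate(self):
        """Raise ValueError if the options break the criteria above."""
        if not self.keywords.strip():
            raise ValueError("a keyword is required")
        if self.method not in SEARCH_METHODS:
            raise ValueError(f"unknown search method: {self.method}")
        if self.fuzzy and not 1 <= self.fuzzy_letters <= 10:
            raise ValueError("fuzzy_letters must be between 1 and 10")
        return self

    def effective_fuzziness(self):
        """The letter count only applies when the 'Fuzzy typo' box is ticked."""
        return self.fuzzy_letters if self.fuzzy else 0
```

For example, `KeywordSearchOptions("discrimination").validate()` accepts the defaults, while a `fuzzy_letters` value of 11 is rejected.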

Let me know your preference and this can be added to the user story and wireframes.

Mel

Mar 11, 2020 at 5:09 PM Notified 10 people

Morgan Maguire, CEO

OK. Sounds good. Thanks

Melissa

. I'll discuss with

Rob

and let you know.

Morgan

Mar 11, 2020 at 5:15 PM Notified 10 people

Morgan Maguire, CEO

Hi

Melissa

,

Following up on the above, given that we're offering these advanced features in the current application in Subject Navigator, I think we should have them available in the new application as well. Could you please update the applicable requirements in the user stories, and then we'll plan to integrate dtSearch in the searches for the Subject Navigator and the Full Text Search.

Thanks,

Morgan

Mar 11, 2020 at 7:07 PM Notified 10 people

Melissa Cowell, General Manager

Hi

Morgan

Will do.

Harsh

Ketan

please note, this will require minor edits to the Subject Navigator HTML templates.

Mel

Mar 12, 2020 at 2:50 PM Notified 10 people

Morgan Maguire, CEO

Hi everyone,

Following up on the call this morning with

Ketan

,

Jitesh

,

Harsh

,

Rob

and

Radomir

, we've decided to address the search implementation issues as follows:

Rob and Radomir will build the customized indexes necessary for all searches that utilize dtSearch, which will include the following:
1. Full Text Search
2. Subject Navigator Search
3. Document Library Searches
  1. Treaties & Rules
  2. Dispute Documents
All remaining searches will be performed using SQL database search, which will include the following:
1. Research Tools:
  1. Jurisprudence Citator search
  2. Article Citator search
  3. Publication Citator search
  4. Terms & Phrases search
2. Research Notepad
3. Document Comparison
Ketan, Jitesh and Harsh will assess the requirements and projected performance of the SQL searches above and will report back if any of these searches will take more than 2 seconds to produce results. We will then assess whether further customized indexes are required.
Global Search will be performed by sequentially running searches across all the applicable searches, which will include SQL and dtSearch. However, if the performance of the Global Search is insufficient, we will explore the option of building a customized dtSearch index for the Global Search.
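
As a rough illustration of the Global Search approach and the 2-second check described above, here is a minimal Python sketch. The function names and the idea of passing each search in as a callable are assumptions for illustration only; the real searches would call SQL and dtSearch.

```python
import time

SLOW_THRESHOLD_SECONDS = 2.0  # the 2-second limit mentioned in the plan above


def run_global_search(query, backends, clock=time.perf_counter):
    """Run each search in turn; return combined results and per-search timings."""
    results, timings = [], {}
    for name, search in backends.items():
        started = clock()
        hits = search(query)
        timings[name] = clock() - started
        # Tag each hit with the search it came from.
        results.extend((name, hit) for hit in hits)
    return results, timings


def slow_backends(timings, threshold=SLOW_THRESHOLD_SECONDS):
    """Names of searches that exceeded the threshold and may need a custom index."""
    return [name for name, seconds in timings.items() if seconds > threshold]
```

Because the searches run one after another, the total response time is the sum of the individual timings, which is why a single slow SQL search can make the whole Global Search feel slow.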

As a first step, I need to confirm some paperwork with

Rob

. When I confirm that is complete,

Ketan

, could you please provide

Radomir

and

Rob

with the database views they requested during the call? As well, could you please provide them with remote access to the ISLG application server?

Lastly, for ILG, we are likely to adopt a similar approach of creating a customized index for the dtSearch keyword search, but I would like us to finalize things in ISLG before we start that work. Therefore,

Ketan

,

Jitesh

and

Harsh

, please defer all work concerning search features in ILG until we have developed things further in ISLG. However, if you find that deferring this work is disrupting progress in ILG, please let me know, and we will assess and adjust as necessary.

Please let me know if anyone has any questions or concerns.

Thanks,

Morgan

Mar 12, 2020 at 6:39 PM Notified 10 people

Morgan Maguire, CEO

Hi

Ketan

,

Rob

and I have completed the necessary paperwork. Could you please provide

Rob

and

Radomir

with the necessary database views outlined below and remote access to the ISLG application server?

Here is a summary of the database views needed:

Database views from which we can pull data and index. A database view is a queryable object in a database that is defined by a predefined database query. In this context, each database view will represent the specific data fields that need to be indexed to support the searches for Full Text Search, Document Library and Subject Navigator, respectively. Though a view does not store data, it can be thought of as a virtual table that can be queried like a table. A view may combine data from more than one table using joins, or just contain the subset of data needed for searching a specific dataset. The data views should be created by the team that is already familiar with the data model; otherwise, Contegra would need to spend considerable time understanding the full data model.
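
To make the idea of a view concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than the project's SQL Server. The table, column and view names (Branch, DisputeDocument, vw_ExampleSearch) are invented for illustration and are not the ISLG schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Branch (BranchId INTEGER PRIMARY KEY, BranchText TEXT);
    CREATE TABLE DisputeDocument (
        DocumentId INTEGER PRIMARY KEY,
        BranchId INTEGER REFERENCES Branch(BranchId),
        ShortTitle TEXT,
        InternalNotes TEXT  -- not needed for searching, so left out of the view
    );
    INSERT INTO Branch VALUES (1, 'Fair and equitable treatment');
    INSERT INTO DisputeDocument VALUES (10, 1, 'Award', 'draft');
    INSERT INTO DisputeDocument VALUES (11, 1, 'Decision on Jurisdiction', 'final');

    -- The view stores no data; it is a saved query joining the two tables.
    CREATE VIEW vw_ExampleSearch AS
    SELECT b.BranchId, b.BranchText, d.DocumentId, d.ShortTitle
    FROM Branch AS b
    JOIN DisputeDocument AS d ON d.BranchId = b.BranchId;
""")

# The indexer only needs to know the view name, not the underlying data model.
rows = conn.execute(
    "SELECT * FROM vw_ExampleSearch ORDER BY DocumentId"
).fetchall()
for row in rows:
    print(row)
```

The point of the design is the last query: whoever builds the index can select from the view without learning how the underlying tables are joined.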

Rob

and

Radomir

can provide further detail as required.

Thanks,

Morgan

Mar 19, 2020 at 12:00 AM Notified 10 people

Ketan Sondarva, Technical Project Manager

Hi Morgan,

We will provide the database table structure with fields and the related screen views to the Contegra team by the end of this week.
We have started working on the Subject Navigator module, so initially the Contegra team can start working on Subject Navigator.
As we move along, we will keep them updated on the other modules as well.

Also, for database access, do you need Contegra to have live database server access (Carbon60) or local development server access (at DevIT)?

Thanks,
Ketan Sondarva

Mar 19, 2020 at 10:19 AM Notified 10 people

Morgan Maguire, CEO

Hi

Ketan

,

As discussed this morning, please provide

Rob

and

Radomir

with access to both the live database server (via Carbon60) and the local development server (via DevIT).

For the live server access, please request additional credentials from Carbon60.

Thanks,

Morgan

Mar 19, 2020 at 9:11 PM Notified 10 people

Harsh Parikh, Tech Lead

Hi

Rob

,

Here, I have attached the Subject Navigator Search document which contains Screen View and Database Table Structure.

Search Document Version -1.docx 209 KB • Download

Mar 27, 2020 at 10:35 AM Notified 10 people

Morgan Maguire, CEO

Thanks

Harsh

. When possible, could you please provide similar documents for the Full Text Search and Document Library searches.

Rob

, please let us know whether the document above provides you with the information you need.

Thanks,

Morgan

Mar 27, 2020 at 3:10 PM Notified 10 people

Rob Wiesenberg

Morgan

,

Harsh

, thank you for the document.

Radomir

and I will review and comment shortly.

Mar 27, 2020 at 3:12 PM Notified 10 people

Harsh Parikh, Tech Lead

Hi

Rob

and

Radomir

,

The document attached above is very basic. I know you will want to go deeper.

So please go through the document, and if you want further details we will schedule a call to discuss what you need and then provide it.

Mar 27, 2020 at 3:49 PM Notified 10 people

Radomir Mladenovic

Hi

Harsh

,

Thanks for the document. Yes, it would be good to have a call and go through the details. For example, you list "DocumentValue" but it's not clear if that's a table or a column as I don't see it in the diagram.

Basically, what we need comes down to a couple of questions:
- Which columns are full-text searchable?
- Which additional columns do you need in search results? (e.g. ID field(s) that you use to create content URLs.)

We can schedule a call to go through it together. As it will be quite technical, I guess we can do it without bothering everyone else. Just drop me a note when you're available for a call. We could also use chat instead (e.g. Skype, Slack).

Thanks,
Radomir

Mar 29, 2020 at 11:43 AM Notified 10 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Sounds good to me. Can we do a call next Tuesday at 6:30 PM IST?

Here, I have added my skype detail :

Skype id : harsh.parikh05

Jitesh and Ketan will also join this call.

Mar 29, 2020 at 11:48 AM Notified 10 people

Radomir Mladenovic

Hi

Harsh

,

Great, I just sent you a contact invite by Skype. Talk to you on Tuesday.

Mar 29, 2020 at 12:11 PM Notified 10 people

Harsh Parikh, Tech Lead

Hi

Morgan

,

Today, we and the Contegra team had a call to discuss the Subject Navigator Search. We have one question for you.

How many levels deep do you want the search to go? Do you want to search the Dispute Document Full Citation and the cited paragraph number?

Mar 31, 2020 at 2:49 PM Notified 10 people

Morgan Maguire, CEO

Hi

Harsh

,

Questions like this are answered in the document produced by

Melissa

here: https://docs.google.com/document/d/10TP4xS4YUgmnznIUI2pzzMA2HOZu1FPm8dE7zmgudtA/edit.

In the acceptance criteria for Subject Navigator, the keyword search will be performed for the following fields:

Branch Text fields for all Subject Navigator branches
For any Dispute Document associated with a Subject Navigator branch, the applicable:
- Dispute fields: Respondent State, Case Name, Case Number and Special Search Terms
- Dispute Document fields: Short Title and Full Citation

Therefore, it would include the Dispute Document Full Citation (and other fields related to the document), but not the cited paragraph number or text. Going forward, it's very important that you consult this document and the applicable user stories to understand the requirements of each search.

Thanks,

Morgan

cc:

Radomir

Rob

Mar 31, 2020 at 5:39 PM Notified 10 people

Harsh Parikh, Tech Lead

Hi

Morgan

,

Thanks for the clarification.

We are taking the "Special Search Terms" field through a Meta Field, while the other fields are required and hard coded.

Can we make the Special Search Terms field hard coded? We can only identify hard coded fields for the data model view.

Apr 01, 2020 at 2:03 PM Notified 10 people

Morgan Maguire, CEO

Melissa

, what impact would hard coding the Special Search Terms field have on the master lists? Would this mean that filling the field with data would be required? It will be common for this field to be left blank.

Thanks,

Morgan

Apr 01, 2020 at 5:41 PM Notified 10 people

Melissa Cowell, General Manager

Hi there

Morgan

,

No, hard coded fields are not required. This would be fine.

Mel

Apr 01, 2020 at 5:53 PM Notified 10 people

Morgan Maguire, CEO

Ok. Great. Thanks

Melissa

.

Harsh

, hard coding the Special Search Terms field is fine.

Also, I assume this means that any field we integrate into a search will need to be hard coded?

Morgan

Apr 01, 2020 at 6:36 PM Notified 10 people

Harsh Parikh, Tech Lead

Yes. You are right

Morgan

. The search field must be hard coded.

Apr 02, 2020 at 11:26 AM Notified 10 people

Morgan Maguire, CEO

Hi

Harsh

,

Following up on our conversation this morning, it will not be possible for us to hard code all the fields that will be used to populate searches, particularly for a number of the filters used in the Full Text Search and Document Library searches. Therefore, please review the search requirements in the search document: https://docs.google.com/document/d/10TP4xS4YUgmnznIUI2pzzMA2HOZu1FPm8dE7zmgudtA/edit, and let us know how you plan to deal with the situation. Note it's very important that we limit the number of hard coded fields, so that we can adjust fields as required in the future.

Thanks,

Morgan

Apr 02, 2020 at 6:39 PM Notified 10 people

Harsh Parikh, Tech Lead

Hi

Rob

and

Radomir

,

As discussed in the last call, we have attached the spreadsheet for the Subject Navigator Search, which contains all the view columns (currently, it contains dummy data). We have also attached the query that returns this set of columns.

Please check and let us know if you have any concerns.

SN_View_Columns.xlsx 11.9 KB • Download

SN_View_Query.sql 3.15 KB • Download

Apr 07, 2020 at 7:10 AM Notified 10 people

Radomir Mladenovic

Hi

Harsh

,

In which database can I try this query? I was looking at SQL Server at 10.68.138.11 but couldn't find any that contains referenced tables.

I don't really understand the spreadsheet you sent but if the SQL you made returns data that should be indexed, that should be sufficient.

Thanks,
Radomir

Apr 07, 2020 at 8:48 PM Notified 10 people

Harsh Parikh, Tech Lead

Hi

Rob

,

You can try the query attached above on the ISLGRebuild database on the 10.68.138.11 server.

When you run the attached query on the ISLGRebuild database, you get the columns that we included in the attached spreadsheet.

Please note that all currently available data is dummy data.

Apr 08, 2020 at 9:10 AM Notified 10 people

Radomir Mladenovic

Hi

Harsh

,
I successfully executed the query on ISLGRebuild. Looks fine to me. Could you please just create a view for it so we can do a simple "select * from <view_name>" to get the data?

Apr 08, 2020 at 4:49 PM Notified 10 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We have created a view for the Subject Navigator Search.

You can use Select * From vw_SubjectNavigatorSearch on the ISLGRebuild database.

Apr 10, 2020 at 6:57 AM Notified 10 people

Radomir Mladenovic

Thanks

Harsh

, I'll look into it and let you know if anything comes up.

Apr 10, 2020 at 8:23 AM Notified 10 people

Radomir Mladenovic

Hi

Harsh

,

I need a small change to the view... We need a column that can be used as a "document" identifier in the index. As this view was created by joining several tables, it's hard to tell what's unique.

Could you please re-create the view, making sure the first column is "id", e.g. by creating it as a string combination of column values and a delimiter? Something like:

CONCAT(branchId, '/', ParentId, '/', documentId) as ID, ... (continue with columns as in the current view)

I hope this makes sense. Please let me know if you have any questions.

Thanks,

Radomir

Apr 16, 2020 at 10:33 PM Notified 10 people

Harsh Parikh, Tech Lead

Hi

Radomir

.

Sorry, I don't understand what you want.

My assumption is that you want a column in the view that refers to the document name.

Please clarify, or we can have a short call to discuss.

I am available on Skype on all working days (10:00 AM to 8:00 PM IST).

Apr 17, 2020 at 8:34 AM Notified 10 people

Radomir Mladenovic

Hi

Harsh

, I just need a column (number or string) that can be treated as an identifier of a row. dtSearch requires each document to have a unique identifier. We could use a random value, but that's not ideal because it prevents the eventual use of incremental indexing (e.g. we could not update a record as we wouldn't know the ID associated with the row).
The problem I have now is that I used the branchId (the first column) as the dtSearch document ID but, as this is not a unique value in the view, we overwrite records and in the end don't have all rows in the index.
So, I suggested creating an "artificial" ID built from all the relevant IDs that make up the row and, as in the example I gave, joining these IDs together with some delimiter character.
Just make this column the first in the view; that's how we'll know it's the ID column.
If you need additional clarification, feel free to contact me on Skype.
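
The overwrite problem described above can be reproduced with a toy example. This Python sketch uses a plain dictionary as a stand-in for the index, not dtSearch itself, and the sample rows and column names are invented for illustration.

```python
rows = [
    {"branchId": 1, "parentId": 0, "documentId": 10, "text": "Award"},
    {"branchId": 1, "parentId": 0, "documentId": 11, "text": "Decision"},
    {"branchId": 2, "parentId": 1, "documentId": 10, "text": "Award"},
]


def build_index(rows, make_id):
    """Toy 'index': a dict keyed by document id, so duplicate ids overwrite."""
    index = {}
    for row in rows:
        index[make_id(row)] = row
    return index


def composite_id(row):
    """Join the row's component ids with a delimiter to get a unique key."""
    return "/".join(str(row[k]) for k in ("branchId", "parentId", "documentId"))


# Keyed by branchId alone, the two rows for branch 1 collide and one is lost.
by_branch = build_index(rows, lambda row: row["branchId"])
# Keyed by the composite id, every row survives.
by_composite = build_index(rows, composite_id)
print(len(by_branch), len(by_composite))
```

A composite key also stays the same each time the view is queried, which is what makes updating a single record in the index possible later.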

Apr 17, 2020 at 11:08 AM Notified 10 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Can I set Row_num() as the unique ID for you, as the first column?

Apr 17, 2020 at 11:16 AM Notified 10 people

Radomir Mladenovic

Hi

Harsh

, OK, let's go with Row_num().

Apr 17, 2020 at 2:11 PM Notified 10 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We have added Row_num() as the unique ID in the view. We have given it the alias Id and set it as the first column.

Please check and confirm.

Apr 20, 2020 at 6:19 AM Notified 11 people

Radomir Mladenovic

Hi

Harsh

,

Yes, the updated view looks good. I indexed it and the index is in "C:\Temp\test-index\subject-nav" on the Web server.

Now, do you want us to create a helper library to consume the indexes, or would you prefer to do it on your own? Would you prefer a Controller to access it as a web service, or a library (DLL)? Which C# framework version do you use?

Apr 20, 2020 at 12:57 PM Notified 11 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Yes, we would like you to create the helper to consume the indexes. We are using .NET Core 2.1 and C# 7.0.

Apr 20, 2020 at 2:00 PM Notified 11 people

Radomir Mladenovic

Harsh

OK, but how do you prefer to use it? As a library, so you call it from code, or as a Controller to call from JavaScript?

Apr 20, 2020 at 2:02 PM Notified 11 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We prefer a Controller to call from JavaScript, because if we need some customization, we can then do it easily.

This is my opinion; please suggest whatever you think is the best approach.

Apr 20, 2020 at 2:06 PM Notified 11 people

Radomir Mladenovic

Hi

Harsh

,

I prepared the first version of the search web service. It's on the web server in "D:\TologixWebSearch". It's a .NET Core 3.1 application.
Could you install this app on the webserver so I can finalize the setup?
To check if it's running, you can send POST /api/search/subject-nav with parameter searchRequest=branch for example.

BTW, I just now realized that you said you're using Core 2.1, not 3.1. If that's a problem, let me know and I'll downgrade the code.

Thanks.

Apr 23, 2020 at 7:26 AM Notified 11 people

Harsh Parikh, Tech Lead

Yes

Radomir

. We are using .Net Core 2.1 Version.

Apr 23, 2020 at 11:14 AM Notified 11 people

Radomir Mladenovic

Harsh

I downgraded the search application to .Net Core 2.1. It's in the same folder, "D:\TologixWebSearch"

Apr 23, 2020 at 11:37 PM Notified 11 people

Radomir Mladenovic

Hi

Harsh

, did you have a chance to install the search application?

Apr 28, 2020 at 2:42 PM Notified 11 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

I have not gone through the search application yet, as I am busy completing other work. I will go through it within a day or two and get back to you.

Apr 28, 2020 at 2:53 PM Notified 11 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

I went through the following path on the server at 10.68.138.10:

E:\TologixWebSearch

I see that you put the published code in this folder. Am I right?

Can we have a call tomorrow at 6:00 PM IST (Ahmedabad time)?

Apr 28, 2020 at 3:02 PM Notified 11 people

Radomir Mladenovic

Hi

Harsh

, yes, that's the publish folder.
Yes, we can talk Wednesday 6:00 PM IST

Apr 28, 2020 at 5:48 PM Notified 11 people

Harsh Parikh, Tech Lead

Hi

Morgan

,

We had a call with Radomir about the Subject Navigator search, and everything is going well.

It would be good for us to get the source code that Radomir developed for SN for demo purposes, so that we can test it and get an idea of how to do the same thing in our local environment.

Radomir

, could you please confirm that?

Apr 29, 2020 at 2:11 PM Notified 11 people

Morgan Maguire, CEO

Great

Harsh

. Look forward to seeing it when it's ready to be deployed as a demo.

Thanks,

Morgan

Apr 29, 2020 at 2:59 PM Notified 11 people

Harsh Parikh, Tech Lead

Morgan

, could you confirm that the Contegra team can provide their source code to us for demo purposes?

Apr 29, 2020 at 3:00 PM Notified 11 people

Morgan Maguire, CEO

Radomir

and

Rob

, can you respond to

Harsh

's inquiry above about the source code?

Thanks,

Morgan

Apr 29, 2020 at 3:07 PM Notified 11 people

Morgan Maguire, CEO

Hi

Harsh

,

Rob

said the source code will be delivered when the entire project is complete. However, he said that you should be able to perform all the testing you need to do in the interim.

Is there a reason you need the source code now? If so, please specify why, and we can work something out.

Thanks,

Morgan

Apr 29, 2020 at 3:11 PM Notified 11 people

Harsh Parikh, Tech Lead

Hi

Morgan

,

We just need the source code to check how the custom dtSearch indexing works, how we can pass the input parameter model and how the data is returned in JSON format.

We need the source code only once. For example, Radomir developed the web app for Subject Navigator; once we understand the approach, we won't need it for the rest of the modules.

This is the first module we are going to implement, so we need to be sure on our side that everything is going properly.

Apr 29, 2020 at 3:17 PM Notified 11 people

Radomir Mladenovic

Hi

Harsh

, as discussed, I've updated the service to accept JSON POST requests. The application is in the same folder (E:\TologixWebSearch) and the sample index is in C:\Temp\test-index\subject-nav

Example: POST /api/search/subject-nav
{
    "searchRequest": "pink link",
    "searchType": "Phrase"
}
or
{
    "searchRequest": "pink link fooooo",
    "searchType": "AnyWords"
}

Here's the object model from where you can see all available options:

using System;

public class SearchModel
{
    // The search text entered by the user
    public string SearchRequest { get; set; }

    // Paging
    public int PageNum { get; set; }
    public int PageSize { get; set; }

    // Linguistic options
    public bool Fuzzy { get; set; }
    public int Fuzziness { get; set; }
    public bool Stemming { get; set; }
    public bool WordNetSynonyms { get; set; }
    public bool Synonyms { get; set; }
    public bool PhonicSearching { get; set; }

    // Search method and sorting
    public SearchType SearchType { get; set; }
    public string SortField { get; set; }
    public string SortOrder { get; set; }

    public int SearchFlags { get; set; }

    // Date range filter
    public bool EnableDateSearch { get; set; }
    public DateTime? StartDate { get; set; }
    public DateTime? EndDate { get; set; }

    public string FileConditions { get; set; }

    public string BooleanConditions { get; set; }

    public bool IncludeSynopsis { get; set; }

    public int Near { get; set; }

    public bool ExcludeEnabled { get; set; }
    public string ExcludeTerm { get; set; }

    // Defaults used when a request leaves these options out
    public SearchModel()
    {
        IncludeSynopsis = true;
        Stemming = true;
        Fuzziness = 4;
        SearchType = SearchType.AllWords;
        Near = 14;
        SearchRequest = null;
    }
}

public enum SearchType
{
    NoValue,
    AllWords,
    AnyWords,
    Boolean,
    Phrase,
    NearTerm
}

Sort order options:
scoredesc
scoreasc
hitsdesc
hitsasc
locationasc
locationdesc
documentasc
documentdesc
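
As an illustration of how a client could assemble a request body from the options above, here is a minimal Python sketch. The helper name `build_search_request` is hypothetical, and the `sortOrder` key is an assumption based on the camelCase style of the two example bodies; only `searchRequest` and `searchType` appear in the examples. The sketch builds the JSON text only and does not call the service.

```python
import json

# Values copied from the SearchType enum and the sort order list above.
SEARCH_TYPES = {"NoValue", "AllWords", "AnyWords", "Boolean", "Phrase", "NearTerm"}
SORT_ORDERS = {
    "scoredesc", "scoreasc", "hitsdesc", "hitsasc",
    "locationasc", "locationdesc", "documentasc", "documentdesc",
}


def build_search_request(search_request, search_type="AllWords", sort_order=None):
    """Return a JSON body for POST /api/search/subject-nav (hypothetical helper)."""
    if search_type not in SEARCH_TYPES:
        raise ValueError(f"unknown search type: {search_type}")
    if sort_order is not None and sort_order not in SORT_ORDERS:
        raise ValueError(f"unknown sort order: {sort_order}")
    body = {"searchRequest": search_request, "searchType": search_type}
    if sort_order is not None:
        body["sortOrder"] = sort_order  # assumed key name, see note above
    return json.dumps(body)


print(build_search_request("pink link", "Phrase"))
```

Checking the values on the client side before sending means a typo in the search type or sort order is caught early rather than surfacing as a server error.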

Let me know if you have any questions.

Apr 29, 2020 at 6:15 PM Notified 11 people

Morgan Maguire, CEO

Hi

Harsh

,

I've got confirmation from

Rob

that

Radomir

will be providing you the source code for the Subject Navigator module in due course.

Thanks,

Morgan

Apr 30, 2020 at 5:16 PM Notified 11 people

Radomir Mladenovic

Hi

Harsh

, you can get the project sources from
https://www.dropbox.com/s/wffgqgnv1qx6rde/contegra-tologix-master.zip?dl=0

May 01, 2020 at 8:35 AM Notified 11 people

Harsh Parikh, Tech Lead

OK Thanks

Radomir

.

We will check and try it and let you know if we face any issue.

May 01, 2020 at 8:37 AM Notified 11 people

Morgan Maguire, CEO

Hi

Harsh

and

Ketan

,

I got a note from

Rob

this morning asking whether we had any feedback on the custom indexer for the Subject Navigator. Could you provide an update on where we stand on this issue? Is the indexer performing as expected? Also, would it be possible for me and my team to get a preview of how it works?

Also, what are next steps in starting work on the next indexers (Full Text Search and Document Library searches: https://docs.google.com/document/d/10TP4xS4YUgmnznIUI2pzzMA2HOZu1FPm8dE7zmgudtA/edit)? Phase 2 of Subscriber side development includes the Disputes & Dispute Documents, which is scheduled to start development on May 21 and will include the Document Library searches. But I'm wondering if

Rob

and

Radomir

could get started on their work in advance to ensure the indexers are ready when the relevant development phases start?

Thanks,

Morgan

May 11, 2020 at 6:39 PM Notified 11 people

Harsh Parikh, Tech Lead

Hi Morgan,

We are planning to integrate the Subject Navigator indexes after 20th May (after completing the core development of phase 1).

We will take up Dispute & Dispute Document Library in phase 2 development and give it first priority.

Full Text Search is a big module, so we will prioritize it in Phase 3.

Ketan will discuss this in more detail with you in today's call.

May 12, 2020 at 7:07 AM Notified 11 people

Morgan Maguire, CEO

Ok. Thanks Harsh.

I spoke with Ketan this morning; he'll speak to you and the team about coming up with a specific date on which Rob and Radomir can expect to start working on the next set of custom indexes. We'll discuss further during Thursday's call.

Thanks,

Morgan

May 12, 2020 at 4:59 PM Notified 11 people

Morgan Maguire, CEO

Hello everyone,

Further to discussions and emails with Rob and Ketan this week, it sounds like we are nearing the point of getting Rob and Radomir back involved in the project.

To start things off, Ketan, could you please provide more details on the timeline for Rob and Radomir's involvement, and what the steps and expectations will be over the next few weeks?

Thanks,

Morgan

Jul 22, 2020 at 5:22 PM Notified 11 people

Ketan Sondarva, Technical Project Manager

Hi Rob, Radomir,

We will start integrating the Subject Navigator index you provided in the first week of Aug. In the meantime, we will share the SQL View of the Dispute & Dispute Document module with you by 5th Aug so your team can start working on index creation for it. For FTS (Full Text Search), we are planning to send you the SQL View by mid-Aug, as by then we can finish our integration with Subject Navigator and our work on Dispute & Dispute Document.

Also, I would like to know how much time it generally takes to create an index for such a module if we provide proper details of the SQL View on time, so that we can plan our development and integration accordingly.

Thanks,
Ketan Sondarva

Jul 23, 2020 at 6:22 AM Notified 11 people

Radomir Mladenovic

Hi Ketan,

It would be great if you could provide the SQL Views as soon as possible so we can review them and start working. It's hard to tell how much time indexing will need: it depends on the amount of data, the number of files to be indexed, system and database performance, etc. Even if you don't have all documents ready but have a decent amount, we can test with that and get some numbers from it.

Note that my availability in August is limited. I have ongoing projects, but I can still dedicate some hours to your project in the first half of Aug. However, from Aug 15 I have scheduled vacation and on-site work planned, so my availability will be very limited until the second week of September.

Thanks,
Radomir

Jul 25, 2020 at 6:58 PM Notified 11 people

Ketan Sondarva, Technical Project Manager

Hi Rob,

Thanks for the update; noted.

We have some queries about the Dispute & Dispute Document SQL View. Could we connect on 29th or 30th July, 2020 at 6:00 PM IST to discuss them? That will give you a better idea of the SQL View, and you can also raise any queries from your side.

Thanks,
Ketan Sondarva

Jul 28, 2020 at 8:03 AM Notified 11 people

Harsh Parikh, Tech Lead

Hi Rob and Radomir,

We have some queries about the Dispute & Document SQL View. Are you able to take a call on Skype tomorrow (30th July) at 6:00 PM IST?

Jul 29, 2020 at 3:20 PM Notified 11 people

Rob Wiesenberg

That time works for me.

Radomir, are you available at that time?

Jul 29, 2020 at 3:23 PM Notified 11 people

Radomir Mladenovic

Hi Harsh, unfortunately, I'm not available tomorrow at that time. Tomorrow I can do 11:00 AM IST - it's too early for Rob but, as it's a technical call, I guess it's not necessary for him to participate.

Rob, is that ok with you?
Otherwise, I'm available on Friday (31st July) at 6:00 PM IST.

Jul 29, 2020 at 6:02 PM Notified 11 people

Rob Wiesenberg

Radomir / Harsh, I can meet on July 31 at 6:00 PM IST, but if that is not convenient, please go ahead and meet at a convenient time without me.

Jul 29, 2020 at 6:06 PM Notified 11 people

Radomir Mladenovic

Harsh, I copied the DB indexer to folder E:\Programs\TologixDBIndexer on 10.68.138.10. Talk to you tomorrow.

Aug 03, 2020 at 9:32 PM Notified 11 people

Harsh Parikh, Tech Lead

Hi Rob and Radomir,

We have a few queries regarding search results in the Dispute & Document Library module, which we will discuss with the Industrial team on Monday. After that we will provide the SQL view of Dispute & Document Library.

Aug 06, 2020 at 11:14 AM Notified 11 people

Harsh Parikh, Tech Lead

Hi Rob and Radomir,

We are working on integrating the Subject Navigator Contegra search into our application, and we are getting search results through the API call.

For the next module, Dispute & Document Library, I have added the following SQL View:

vw_DisputeDocumentLibrarySearch on the ISLGRebuild database.
(server: 10.68.138.11)

But I have a concern regarding the search parameters.

Dispute&Document_Parameter.png 93.6 KB • Download

As the screenshot above shows, there are many search parameters that we need to pass when the user clicks on filter, and the results must match the parameters selected.

For example, suppose we enter the keyword "ICSID" in the text box and select the Language "English" from the search parameters. We then need the index to return results that match both the keyword ICSID and the language English.

So my question is: how will the index return results for the different parameters I selected?

If you need to discuss this, we are available on 13th August, as tomorrow is a national holiday.

Please suggest.

Aug 11, 2020 at 11:02 AM Notified 11 people

Radomir Mladenovic

Hi Harsh,

You need to pass additional filters via Custom object, for example:

{
"searchRequest": "pink link",
"searchType": "AnyWords",
"custom": {
"language": "English"
}
}

You can see the implementation of this filtering in the SearchController.cs, lines 201-217.

Hope this helps. Let me know if you have any questions.

Regards,
Radomir

Aug 11, 2020 at 8:48 PM Notified 11 people
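
The request shape described in the message above could be assembled on the client side roughly as follows. This is a minimal Python sketch for illustration: the helper name with_custom_filters is hypothetical, and only the "searchRequest", "searchType" and "custom" keys come from the message.

```python
import json


def with_custom_filters(search_request, search_type="AnyWords", **filters):
    """Build a search body; extra filters go under the "custom" object."""
    body = {"searchRequest": search_request, "searchType": search_type}
    if filters:
        body["custom"] = dict(filters)
    return body


# Keyword "ICSID" restricted to English documents, as in Harsh's example.
payload = with_custom_filters("ICSID", language="English")
payload_json = json.dumps(payload, indent=2)
```

Each additional UI parameter would become one more key inside "custom", provided the service has a matching filter implemented for it.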

Radomir Mladenovic

ExampleCustomModel.zip 1.27 KB • Download

Harsh, as discussed, I'm attaching an example of how you can deserialize the search engine response to a custom model. After deserialization, you can simply access fields as in:

response.Results[0].Data.Filename

Hope this helps.

Aug 15, 2020 at 11:00 AM Notified 11 people

Harsh Parikh, Tech Lead

Hi Rob and Radomir,

As discussed on Friday's (14/08) call, we have added a HierarchicalParentIds column to the Subject Navigator SQL view.

vw_SubjectNavigatorSearch

We have updated the SQL view in the ISLGRebuild database on the 10.68.138.11 server.

We have kept the ParentId column as it is and added the new HierarchicalParentIds column, which holds multiple parent IDs as a comma-separated list, as shown in the following screenshot.

SN_SQLView.PNG 37.7 KB • Download

Please let us know if you have any concerns.

Aug 17, 2020 at 6:57 AM Notified 11 people

Radomir Mladenovic

Harsh, I updated the indexer to make it run as a command-line tool as well.
You can get it from https://www.dropbox.com/s/fxd5acvnpxw7h6e/DBIndexer-with-cmd-line.zip?dl=0
If you run TlogixDBIndexer and pass a filename as an argument (which is a config file), it will run in the command line mode only. You can find a sample config file "index-config-subject-nav.json" in the archive.

I'll look at the parents field soon.

Aug 17, 2020 at 4:47 PM Notified 11 people

Radomir Mladenovic

Harsh, I have one more indexer update for you:
https://www.dropbox.com/s/riuyasnjn9rv7ll/DBIndexer3.zip?dl=0
It includes 3 dtsearch data files that define some indexing behavior (e.g. noise words). The code was updated to load them from the application folder. (This change was needed to fix behavior where single letter terms were not indexed so, for example, document ID with value "1" could not be found.)

Next, I implemented support for finding and returning parent nodes listed in the HierarchicalParentIds column. The search response payload now includes a parents field, which is a list of documents in the same format as the results. A sample response looks like this:

{
    "totalResults": 7,
    "results": [
        {
            "fields": {
                "id": "35",
                "branchid": "35",
                "branchtypeid": "3",
                "branchname": "Test new subject",
                "Filename": "db://vw_SubjectNavigatorSearch#Id=35",
                "parentid": "20",
                "documentid": "15",
                "contenttypedatamasterid": "29",
                "disputedocumentshorttitle": "OT/0001/03 - NJ Test Case AF-0005-01 Destination Code - 31/08/2020 - English",
                "shorttitle": "NJ Test Case AF-0005-01 Destination Code",
                "fullcitation": "NJ Test Case, AF-0005-01 Destination Code, 31 August 2020",
                "casename": "NJ Test Case"
            }
        },
...
    ],
    "parents": [
        {
            "fields": {
                "id": "27",
                "branchid": "27",
                "hierarchicalparentids": "1,27",
                "branchtypeid": "3",
                "branchname": "Abuse of Process",
                "Filename": "db://vw_SubjectNavigatorSearch#Id=27",
                "parentid": "1"
            }
        },
...
        {
            "fields": {
                "id": "1",
                "branchid": "1",
                "hierarchicalparentids": "1",
                "branchtypeid": "3",
                "branchname": "A",
                "Filename": "db://vw_SubjectNavigatorSearch#Id=1"
            }
        }
    ]
}

Note that the parent list contains only elements which are not present in the results list.

You can get the search service update from:
https://www.dropbox.com/s/1y4e4tke94sqngn/TologixWebSearch.zip?dl=0

Aug 17, 2020 at 8:02 PM Notified 11 people
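
The rule stated in the message above (the parents list holds only nodes that are not already in the results list) can be illustrated with a short Python sketch. It is not the service's implementation: the function name is hypothetical, and the sketch assumes each hit carries a comma-separated "hierarchicalparentids" field as in the sample response.

```python
def missing_parent_ids(results):
    """Return parent ids referenced by the hits but absent from the hits."""
    hit_ids = {item["fields"]["id"] for item in results}
    needed = []
    for item in results:
        raw = item["fields"].get("hierarchicalparentids", "")
        for parent_id in (part.strip() for part in raw.split(",")):
            # Skip blanks, ids that are hits themselves, and repeats.
            if parent_id and parent_id not in hit_ids and parent_id not in needed:
                needed.append(parent_id)
    return needed


# Hypothetical hit shaped like the "Abuse of Process" node in the sample.
sample_hits = [{"fields": {"id": "27", "hierarchicalparentids": "1,27"}}]
parents_to_fetch = missing_parent_ids(sample_hits)
```

The ids returned here would then be looked up in the index to build the parents list.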

Harsh Parikh, Tech Lead

Hi Rob and Radomir,

We need all result data, both parents and branches, in one array. We don't need a separate array for the parents data.

Could you please modify this and let us know? Also, the above Dropbox link is not working.

Aug 18, 2020 at 5:48 AM Notified 11 people

Radomir Mladenovic

Hi Harsh,

I separated results and parents for a reason. Otherwise, how would you know which node contains a hit and which node is there only as support? It's one line of code on your end to join both arrays if you really need them that way. But if you're sure you don't need separation between hits and non-hits, I can join them in the service. Please confirm.

BTW, which Dropbox link is not working? I tried both from the previous message and they both open.

Aug 18, 2020 at 6:59 AM Notified 11 people
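
The client-side join mentioned in the message above might look like the following Python sketch. The "isHit" flag is a hypothetical addition, not part of the service response; it shows one way to merge the arrays without losing the hit/non-hit distinction.

```python
def merge_hits_and_parents(response):
    """Join "results" and "parents" into one list, tagging each item."""
    merged = []
    for item in response.get("results", []):
        merged.append({**item, "isHit": True})   # node matched the query
    for item in response.get("parents", []):
        merged.append({**item, "isHit": False})  # node is ancestor context only
    return merged


# Trimmed-down response using ids from the sample payload above.
sample_response = {
    "totalResults": 1,
    "results": [{"fields": {"id": "35"}}],
    "parents": [{"fields": {"id": "27"}}, {"fields": {"id": "1"}}],
}
all_nodes = merge_hits_and_parents(sample_response)
```

If the flag is not needed, the plain join is simply the two lists concatenated.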

Harsh Parikh, Tech Lead

Hi Radomir,

We don't understand the purpose of separating hits and non-hits.

Our understanding is that we need to pull all branch and parent result data into one array. We will then convert that result to our model and pass it to the presentation layer.

Could you please clarify the meaning and use of the hits and non-hits separation?

Also, the above Dropbox link does not open from our end. I have attached a screenshot for your reference.

dropboxlink.PNG 21.8 KB • Download

Aug 18, 2020 at 8:41 AM Notified 11 people

Radomir Mladenovic

I thought you might need info on where the actual hits are. If you don't care, I'll merge the arrays - it's easy to merge, not that easy to separate if you need hit info. I'll update the code and send you an update later today.

As for Dropbox, check your network. Works fine for me.

Aug 18, 2020 at 8:58 AM Notified 11 people

Harsh Parikh, Tech Lead

Hi Radomir,

We don't need to display the hit count, so just merge the arrays and let us know once you have updated the service.

We will try the Dropbox link again and pull the service into our local environment.

One more question regarding the ExampleCustomModel.zip file above: do we have to create a different result model for each tool?

For example, the example model above works for SN, but do we need to create another model for DisputeDocumentLibrary?

Aug 18, 2020 at 9:27 AM Notified 11 people

Radomir Mladenovic

Hi Harsh,

I made a change to the search service to add parents to the results list.
You can get it from https://www.dropbox.com/s/8va4x1q2jkzptls/TologixWebSearch2.zip?dl=0

As for the custom model, if DisputeDocumentLibrary or another index has different fields, you will need a different model. I mentioned this as a downside of a custom model during our last call.
What you could do is make one model with all possible fields you have, across all the different indexes you're going to create. It's messy but it's a single model.

Aug 18, 2020 at 6:15 PM Notified 11 people

Harsh Parikh, Tech Lead

Hi Radomir and Rob,

We are not getting results for all parent IDs from the indexes. We looked into your TologixWebSearch application and found that you are using Ids in place of BranchIds.

Can we take a call on Monday (31st August) at 2:00 PM IST about the Subject Navigator index results?

Aug 28, 2020 at 10:25 AM Notified 11 people

Radomir Mladenovic

Harsh, I'm not available for a call on Monday. But do we need a call for this?
Please send me details of what exactly the issue is and you'll have a fix by Monday.

Aug 28, 2020 at 11:25 AM Notified 11 people

Harsh Parikh, Tech Lead

Hi Radomir,

Actually, you are using the Id column instead of the branch id to look up parent IDs and to fetch the result data for those parents.

You are using the Id column in lines 112 and 84. We changed these to use branchid instead, and we now get all results.

Please confirm whether that is OK.

Aug 28, 2020 at 11:32 AM Notified 11 people

Radomir Mladenovic

Hi Harsh,
So, if I understood correctly, you modified the view to use branchid in the id field and now it works fine? Sure, if you see no side effects, it's fine.

Aug 28, 2020 at 3:54 PM Notified 11 people

Harsh Parikh, Tech Lead

Hi Radomir,

I did not modify the view. I used branchid in place of id in lines 112 and 84 of your TologixWebSearch application, then published the code again and checked.

It works fine.

Is the change we made in SearchController.cs at lines 112 and 84 OK?

Aug 28, 2020 at 4:02 PM Notified 11 people

Radomir Mladenovic

Hi Harsh, sure that's fine. I made the same change on my side so we're in sync.

Aug 29, 2020 at 1:18 PM Notified 11 people

Harsh Parikh, Tech Lead

OK, thanks Radomir. Also, we need to know how we can highlight the search keyword using dtSearch.

Are you available to take a call on Tuesday?

Aug 29, 2020 at 1:48 PM Notified 11 people

Radomir Mladenovic

Harsh, on Tuesday I'm available at 11:00 AM IST. Let me know if that works for you.
In general, we can add highlighting by extending the search service. Highlighted text could be added as a field to the JSON result object.

BTW, I saw your message on skype:

> we need to know how we generate indexes on daily basis whiout do manual process to generate indexes

I already sent you an indexer update that runs from the command line. See my message and the Dropbox download URL that I sent on Aug 17 above. You need to set up Windows Task Scheduler to run the indexer and that's it.

Aug 29, 2020 at 2:27 PM Notified 11 people
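
One way to register the daily run described in the message above is through the Windows schtasks command. The Python sketch below only builds the argument list. The task name, the ".exe" extension, the 02:00 start time and the config file location are assumptions; the folder and file names are taken from earlier messages in this thread.

```python
def daily_indexer_task(task_name, indexer_exe, config_path, start_time="02:00"):
    """Return the schtasks argument list for a daily run of the indexer."""
    # Quote both paths so they survive spaces in folder names.
    run_command = f'"{indexer_exe}" "{config_path}"'
    return [
        "schtasks", "/Create",
        "/SC", "DAILY",       # run once per day
        "/TN", task_name,     # name shown in Task Scheduler
        "/TR", run_command,   # command line to execute
        "/ST", start_time,    # 24-hour HH:mm start time
    ]


task_args = daily_indexer_task(
    "TologixSubjectNavIndex",  # hypothetical task name
    r"E:\Programs\TologixDBIndexer\TlogixDBIndexer.exe",
    r"E:\Programs\TologixDBIndexer\index-config-subject-nav.json",
)
# On the Windows server this could be executed with:
# subprocess.run(task_args, check=True)
```

Passing the config file as the only argument matches the command-line mode described in the Aug 17 message.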

Harsh Parikh, Tech Lead

Yes Radomir, we are available on Tuesday at 11:00 AM IST. We saw your message about automated indexing, but we are not able to do that.

We will discuss on Tuesday.

Aug 29, 2020 at 2:52 PM Notified 11 people

Harsh Parikh, Tech Lead

Hi Radomir,

I have attached the SearchController.cs file with the changes we made for the Subject Navigator index.

SearchController.cs 21.5 KB • Download

Please replace the file in your project with the one above.

As discussed, we need to highlight the matched word in the presentation view, as we currently do with the jQuery highlighter.

The word will be highlighted in the following columns:

Respondent State, Case Name, Case Number, Special Search Terms, Short Title and Full Citation.

Also, please let us know when we can schedule a call about generating indexes through the command line.

Sep 01, 2020 at 6:14 AM Notified 11 people

Harsh Parikh, Tech Lead

Hi Radomir,

As discussed in today's call, we need to provide 2 SQL Views. The 1st SQL View contains all dynamic columns, named by field (e.g. Field_100, Field_101), and the 2nd SQL View contains all the columns that we need to bind to our model.

As discussed, you will get the DisputeId column from the 1st SQL View. You need to pass that DisputeId value to the 2nd SQL View and return the result to us in JSON format so we can bind the data to our model.

However, as mentioned in the mail we sent you about the 1st SQL View, its dynamic column structure makes it infeasible to create as a view. So for now we are providing a stored procedure, which contains all the dynamic columns, in place of the 1st SQL View.

We are able to generate the 2nd SQL View because it contains fixed columns.

Here are the stored procedure and the 2nd SQL View:

Stored procedure name (in place of the 1st SQL View): FE_MetafieldwithValueDynamic

2nd SQL View (pass it the DisputeId column that you get from the stored procedure above):

VW_DisputeContegraSearch

You can find the stored procedure and the SQL view in the ISLGRebuild database on the 10.68.138.11 server.

Please let us know whether the stored procedure works on your side. We are also ready to discuss over a call.

Sep 09, 2020 at 12:58 PM Notified 11 people

Radomir Mladenovic

Hi Harsh,

Which field identifies a "document" for this index? I think we mentioned ContentTypeDataMasterId in the past but now you reference the DisputeId. Should DisputeId be used as a document identifier?

I checked the stored proc result and the view. A possible problem I see is that for DisputeId=23 view VW_DisputeContegraSearch returns multiple rows. I can index data as received but in the results you will not be able to tell which field goes with which ContentTypeDataMasterId. (For example, you have Field_299=15532 for ContentTypeDataMasterId=58, but ContentTypeDataMasterId=59 has the same DisputeId so will be part of the same document.)
Is that how you wanted things to be indexed? Please confirm.

I'll be available on Skype in the following hour or two if you want to discuss it quickly.

Sep 10, 2020 at 10:37 AM Notified 11 people

Harsh Parikh, Tech Lead

Hi Radomir,

Yes, please use the DisputeId column as the document identifier.

For the 2nd point, I will ping you on Skype right now.

Sep 10, 2020 at 10:39 AM Notified 11 people

Harsh Parikh, Tech Lead

Hi Radomir,

Thanks for the quick call today. As discussed, we need to pass the DisputeId column to the 2nd SQL View and get results from the index for all 2nd SQL View columns.

Also, we need to get the search result count and highlight the matching word in all the columns.

Sep 10, 2020 at 11:17 AM Notified 11 people

Radomir Mladenovic

Hi Harsh,

After closer examination of the data returned by the stored proc and the view, I still see some issues.

The problem is that both return rows with duplicate DisputeId values. For example, multiple rows exist for DisputeId 7, 30, etc. As we use DisputeId as the document identifier, any duplicate appearing in the first table will overwrite the previously indexed document with the same DisputeId.

1) It's crucial that the first indexed table/view (here we use the result of the stored proc) returns rows that each have a unique document identifier. For the Disputes Search it makes sense to use the DisputeId, but the values should be distinct.

2) The second view indexed may return multiple rows for the same DisputeId, and all columns will be indexed as part of the same indexed document.

Let me know if you have any questions.

Sep 10, 2020 at 9:36 PM Notified 11 people
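
The overwrite behaviour described in the message above can be illustrated with a toy index keyed by the document identifier. This Python sketch is only a model of the idea, not the DBIndexer code. The two sample rows reuse the DisputeId and ContentTypeDataMasterId values quoted in the earlier message, and everything else is hypothetical.

```python
def index_rows(rows, id_field):
    """Model an index as a dict: one stored document per identifier."""
    index = {}
    for row in rows:
        # A later row with the same identifier replaces the earlier one.
        index[row[id_field]] = row
    return index


rows = [
    {"DisputeId": 23, "ContentTypeDataMasterId": 58, "Field_299": "15532"},
    {"DisputeId": 23, "ContentTypeDataMasterId": 59},
]

by_dispute = index_rows(rows, "DisputeId")
by_master = index_rows(rows, "ContentTypeDataMasterId")
```

Keyed by DisputeId, only the last of the two rows survives and the Field_299 value is lost. Keyed by ContentTypeDataMasterId, both rows are kept as separate documents.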

Harsh Parikh, Tech Lead

Hi Radomir,

We don't follow what you are suggesting. As discussed, we need to pass the DisputeId column value from the stored procedure data to the 2nd SQL view to get all the Dispute and Document related data that we need to display on the subscriber side.

If you want a unique column, you can use the ContentTypeDataMasterId column.

If you want to take a call, we are available now until 6:00 PM IST. We are also available on Monday from 10:30 AM to 5:00 PM IST.

Please let us know.

Sep 11, 2020 at 9:53 AM Notified 11 people

Radomir Mladenovic

Hi Harsh,

The problem is not indexing of the second view. The problem is with the first view (stored proc). Each row in the table returned by the stored procedure should correspond to one indexed document (and that will be returned in results as one matching document).

My understanding is that your result item for this search is a Dispute. That means that you should have only one row per dispute in the first view indexed. As you have the same DisputeId in multiple rows, I suspect this is not the case.

BTW, I can index what you have provided so far and continue my work. However, my concern is that the data is not prepared as it should be, which will result in an incomplete index and wrong/incomplete search output.

I hope this makes it more clear. Please let me know.

Sep 11, 2020 at 11:06 AM Notified 11 people

Harsh Parikh, Tech Lead

Hi Radomir,

Yes, you will get rows with the same DisputeId. As discussed in the last call, we group the 2nd SQL View data on the C# side. So when we get multiple rows with the same DisputeId, we will group them on the C# side and then present the data to the user.

Please let us know if you want to discuss over a call.

Sep 11, 2020 at 11:52 AM Notified 11 people

Radomir Mladenovic

Hi Harsh,

If we use DisputeId as the index document identifier, you cannot get multiple rows - you would only get a match for the last row, as it will overwrite the previous rows with the same id.
So, is using ContentTypeDataMasterId for the document ID fine?

Sep 11, 2020 at 12:14 PM Notified 11 people

Harsh Parikh, Tech Lead

Hi Radomir,

I think we need to take one short call to sort this out. Please let us know when you are available to discuss over a call.

Sep 11, 2020 at 12:19 PM Notified 11 people

Harsh Parikh, Tech Lead

Thanks Radomir for the quick call.

As discussed just now, you can take the ContentTypeDataMasterId column as the unique column. Using this column you will get the value of the DisputeId column, and you can pass that DisputeId value to the 2nd SQL View.

Hope this is fine.

Sep 11, 2020 at 12:59 PM Notified 11 people

Radomir Mladenovic

Hi Harsh,

I'm still working on the search service changes - you should have it tomorrow. In the meantime, here's the JSON structure that you can use to send your query:

1) Boolean queries nesting:

{
	"type": "boolean",
	"operator": "and",
	"clauses": [
		{
			"type": "boolean",
			"operator": "or",
			"clauses": [
				...
				another boolean or match type
			]
		}
	]
}

2) Field value matching:

{
	"type": "match",
	"exclude": false,
	"field": "Field_100",
	"value": "3"
}

or with multiple values:

{
	"type": "match",
	"exclude": false,
	"field": "Field_100",
	"values": ["3", "7", "14"],
	"operator": "and"
}

The "exclude" field is for excluding documents that have the given field values. The "operator" tells how to combine multiple values: using OR or using AND.

I hope this covers all your cases. Let me know if anything is missing.

Sep 14, 2020 at 5:25 AM Notified 11 people
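
The two clause shapes from the message above can be produced with small helper functions. This is a minimal Python sketch: the helper names match and boolean are hypothetical, while the key names and example values come from the message.

```python
def match(field, *values, exclude=False, operator="and"):
    """Build a "match" clause; one value uses "value", several use "values"."""
    clause = {"type": "match", "exclude": exclude, "field": field}
    if len(values) == 1:
        clause["value"] = values[0]
    else:
        clause["values"] = list(values)
        clause["operator"] = operator  # how the listed values are combined
    return clause


def boolean(operator, *clauses):
    """Build a "boolean" clause that combines nested clauses."""
    return {"type": "boolean", "operator": operator, "clauses": list(clauses)}


# Field_100 is 3, AND Field_100 contains any of 3, 7 or 14.
query = boolean(
    "and",
    match("Field_100", "3"),
    boolean("or", match("Field_100", "3", "7", "14", operator="or")),
)
```

Building the tree from functions avoids hand-written JSON mistakes such as a missing comma between keys.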

Harsh Parikh, Tech Lead

OK, thanks Radomir. Can we take a call on Wednesday between 10:30 AM and 5:00 PM IST to discuss the JSON structure? In the meantime, we will discuss it internally, and then we can both discuss on Wednesday.

Please confirm a convenient time.

Sep 14, 2020 at 5:33 AM Notified 11 people

Radomir Mladenovic

Harsh, we can talk on Wednesday at 11:00 AM IST.

Sep 14, 2020 at 7:15 AM Notified 11 people

Radomir Mladenovic

sample-disputes-search-request.txt 629 Bytes • Download

sample-disputes-search-response.txt 7.26 KB • Download

Hi Harsh,

Here's an update for you: https://www.dropbox.com/sh/iufr961llu2wrht/AADzxZAmXEcDn-hotEyY3CkNa?dl=0

1) DBIndexer is updated to index the second view. Check the indexer-config-disputes.json config file in the zip file - you probably just need to change the index path to where you want to save the index.

2) TologixWebSearch - copy the CS files to update the existing project files. In appsettings.json you can see that a new parameter was added ("DisputesIndex") that should point to the Disputes index.

Search service endpoint for the Disputes search is: /api/search/disputes

I'm attaching an example request body and response.

Sep 14, 2020 at 8:40 PM Notified 11 people

Radomir Mladenovic

Hi Harsh, did you have time to try Disputes indexing and the search service?
I've noticed that, because of multiple rows in the 2nd view, some fields in results have multiple values. I made changes to the indexer (attached modified file) to prevent that. I'm also sending you a sample output for the same query as in the previous post.

Talk to you tomorrow.

disputes-search-response2.txt 6.76 KB • Download

SampleDataSource.cs 22.4 KB • Download

Sep 15, 2020 at 7:39 PM Notified 11 people

Harsh Parikh, Tech Lead

Hi Radomir,

We will look into this today and will update you.

Sep 16, 2020 at 1:22 AM Notified 11 people

Shrinivas Sambhare

Hi Radomir,

SearchController.cs 22.6 KB • Download

SearchResponse.cs 759 Bytes • Download

Here are the updated files for the Search Response model and the Search Controller.

Let us know if you need anything.

Sep 17, 2020 at 6:09 AM Notified 12 people

Harsh Parikh, Tech Lead

Hi Radomir,

As discussed in today's call, we need only the 2nd View's indexing result to bind the data to our model.

We use the stored procedure only to get the DisputeId from the filtered data and pass that DisputeId value to the 2nd View. So we need the 2nd View data, which contains all the columns that we need to bind to our model.

Hope this is fine. Let us know once you have changed the service.

Sep 17, 2020 at 6:13 AM Notified 12 people

Radomir Mladenovic

Hi Harsh,

I made changes in accordance with today's talk. You can find the modified files at
https://www.dropbox.com/sh/htv0bhk4sea6n0p/AAB7HzodCDz1N6aKf0W-1NGga?dl=0

Note that the dispute index config has changed as well.

A new parameter, DisputeDocsIndex, was added to appsettings.json.

You can see a sample results output JSON in disputes-response-sep17.txt

Hope that's what you need. Let me know if you have any questions.

Sep 17, 2020 at 9:02 AM Notified 12 people

Harsh Parikh, Tech Lead

Hi Radomir,

We have the following operators, which the service needs to support when we pass filter data in the JSON:

is set
is not set
is equal to
is not equal to
is after
is before
is between
is not between
starts before
starts after
ends before
ends after
lasts more than
lasts less than
is greater than
is less than

Let us know once you have updated the service. Also, please provide an example JSON file that uses all of the above operators.

Sep 21, 2020 at 12:30 PM Notified 12 people

Radomir Mladenovic

Hi @Harsh,

These operators do not exist in dtSearch as such. You need to transform them into a combination of "equals" and boolean statements.

For example, for "is set" and "is not set", you could add an index field (column to your view) as "field_59_set" with value "Y" (when the value is set) and "N" (when the value is not set). Then, if you need "field_59 is not set", send search for "field_59_set = N".

As for the after/before/between operators, I guess you need them only for dates, correct? First, you need to change the formatting of dates in your view to the "YYYYMMDD" format. A dtSearch date range filter looks like this: xfilter(word "datefield::20020101~~20020131")
https://support.dtsearch.com/webhelp/dtsearchCppApi/File_Conditions.html

Filters containing starts/ends/lasts probably refer to a specific field. You will need to specify that field in the query, combining with supported operators.

Operators "greater than" and "less than" do not exist as such. What are you comparing? What's the type of the field this applies to?

Sep 21, 2020 at 9:50 PM Notified 12 people
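
Two of the translations suggested in the message above can be sketched in Python. Both helper names are hypothetical. The "_set" companion field with "Y"/"N" values follows the convention proposed in the message, and the filter string follows the xfilter example quoted there.

```python
from datetime import date


def is_set_clause(field, is_set=True):
    """Translate "is set" / "is not set" into a match on a companion field."""
    return {
        "type": "match",
        "exclude": False,
        "field": f"{field}_set",        # e.g. field_59 -> field_59_set
        "value": "Y" if is_set else "N",
    }


def date_range_xfilter(field, start, end):
    """Format a date range filter with dates written as YYYYMMDD."""
    return f'xfilter(word "{field}::{start:%Y%m%d}~~{end:%Y%m%d}")'


not_set = is_set_clause("field_59", is_set=False)
january_2002 = date_range_xfilter("datefield", date(2002, 1, 1), date(2002, 1, 31))
```

The companion field only works if the SQL view actually exposes a column such as field_59_set, which has to be added on the database side.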

Harsh Parikh, Tech Lead

Hi Radomir,

I think we need to discuss the above points on a call. We also have the following 2 questions about the JSON request that we pass to filter Dispute & Document data:
1) If we do not pass any operator in the JSON, the API returns an invalid operator error.
2) We need you to confirm the JSON for "Add Another Rule".

We are available to discuss all of the above points from 10:30 AM to 6:00 PM IST.

Please let us know a convenient time to discuss.

Sep 22, 2020 at 6:55 AM Notified 12 people

Harsh Parikh, Tech Lead

Hi Radomir,

As you need all the details, we have attached a ZIP file containing 4 screenshots, each showing a different set of filter data.

Based on this filter data we built the JSON file, which you can also find in the ZIP.

Please look at that JSON file and let us know whether it is correct for the filter data shown.

Contegra Screenshots.zip 339 KB • Download

As you mentioned, we will talk more on Thursday's call at 11:00 AM IST.

Sep 22, 2020 at 9:55 AM Notified 12 people

Radomir Mladenovic

Hi Harsh,

1)

{
"type":"match",
"field":"Field_62",
"Operator":"and", <<<<< this means both 234 AND 233 should be present, if that's what you wanted, it's fine
"values":["234","233"]
},

2)

{
"type":"match",
"exclude":false,
"field":"Field_347",
"values":[true], <<<< as you have a single field here, you can use the "value" field and then you don't need the Operator here. It would be ignored anyway. (Check the support JSON examples I sent you earlier.)
"Operator":"and"
},

3) The query part with Field_91 and Field_94 doesn't look good. From other items in your JSON I think you didn't fully understand the logic of the structure.

For type:boolean, the operator is used to combine clauses. For example:

{
"type": "boolean",
"operator": "and",
"clauses": [ c1, c2, c3 ]
}

will be translated to c1 and c2 and c3 where c1/c2/c3 are boolean or match elements.

According to my understanding of your UI and what you told me, the JSON for this part would look like:

{
    "type":"boolean",
    "Operator":"or", <<< you combine with OR the same queries for Field_91 and Field_94
    "clauses":[
        { <<< a clause for Field_91
            "type":"boolean",
            "Operator":"or", <<< this is OR selected in your UI
            "clauses":[
                {
                    "type":"match",
                    "exclude":false,
                    "field":"Field_91",
                    "values":["23447","23573"],
                    "Operator":"and"
                },
                {
                    "type":"match",
                    "exclude":true,
                    "field":"Field_91",
                    "value":"23411"
                }
            ]
        },
        { <<< a clause for Field_94
            "type":"boolean",
            "Operator":"or", <<< this is OR selected in your UI
            "clauses":[
                {
                    "type":"match",
                    "exclude":false,
                    "field":"Field_94",
                    "values":["23447","23573"],
                    "Operator":"and"
                },
                {
                    "type":"match",
                    "exclude":true,
                    "field":"Field_94",
                    "value":"23411"
                }
            ]
        }
    ]
}

Hope this helps.

Sep 22, 2020 at 10:51 AM Notified 12 people
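
To make the clause semantics in the message above concrete, the following Python sketch evaluates a filter tree against one document held in memory. It is an illustration of the logic only, not how the search service or dtSearch executes queries. It assumes lowercase "operator" keys and models a document as a mapping from field name to a set of values.

```python
def _combiner(operator):
    """Map "and" / "or" to the matching Python reducer."""
    return all if operator.lower() == "and" else any


def evaluate(clause, doc):
    """Return True if the document satisfies the clause tree."""
    if clause["type"] == "match":
        present = doc.get(clause["field"], set())
        if "values" in clause:
            combine = _combiner(clause["operator"])
            matched = combine(v in present for v in clause["values"])
        else:
            matched = clause["value"] in present
        # "exclude" inverts the outcome of the match.
        return not matched if clause.get("exclude", False) else matched
    if clause["type"] == "boolean":
        combine = _combiner(clause["operator"])
        return combine(evaluate(c, doc) for c in clause["clauses"])
    raise ValueError(f"unknown clause type: {clause['type']}")


# The Field_91 part of the example above.
field_91_rule = {
    "type": "boolean",
    "operator": "or",
    "clauses": [
        {"type": "match", "exclude": False, "field": "Field_91",
         "values": ["23447", "23573"], "operator": "and"},
        {"type": "match", "exclude": True, "field": "Field_91",
         "value": "23411"},
    ],
}

has_both = evaluate(field_91_rule, {"Field_91": {"23447", "23573"}})
only_excluded = evaluate(field_91_rule, {"Field_91": {"23411"}})
```

A document holding both 23447 and 23573 satisfies the first inner clause, while a document holding only 23411 fails both inner clauses and is rejected.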

Radomir Mladenovic

Hi Harsh,

I made changes in accordance with today's call.
You can find updated files at: https://www.dropbox.com/sh/wvdyic79bqey4i4/AADZen55gjB-nhPwqDRG30MVa?dl=0

1) The indexer was updated to index date fields in YYYYMMDD format. Any other date fields in the view that you send as text need to be fixed on your end.

2) Search service changes:

a) Instead of using QueryStatement, send JSON via field FilterStatement. The JSON structure remains the same, with additions for date search below.

b) To search for a date range, use an object of the "range" type:

{
"type": "range",
"field": "datecreated",
"from": "20190919",
"to": "20210919"
}

For the BEFORE operation, also use type "range" but specify only "to" (omit the "from" field). For the AFTER operation, use "range" and "from".

c) The operator field is now optional and the default is OR as you requested.

d) To sort results use SortField with name of the field on which you sort, and SortOrder, which can have value "asc" or "desc".
Note that the sorting will be used on the "disputes" index search, and then you're getting results from the "dispute-docs" for already ordered disputes. I hope that makes sense.

Below is an example of a json structure with the new features:

{
"searchRequest": "test",
"FilterStatement": {
"type": "boolean",
"operator": "or",
"clauses": [
{
"type": "match",
"field": "datecreated",
"value": "20201001"
},
{
"type": "range",
"field": "datecreated",
"from": "20190919",
"to": "20210919"
},
{
"type": "range",
"field": "datemodified",
"to": "20190110"
}
]
},
"SortField": "language",
"SortOrder": "desc"
}
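
A minimal Python sketch of building such a request on the client side, assuming dates are held as date objects. The function names date_range and search_request are hypothetical; the JSON keys ("type", "field", "from", "to", "FilterStatement", "SortField", "SortOrder") are the ones shown above.

```python
import json
from datetime import date

def date_range(field, start=None, end=None):
    """Range clause: both bounds, only "to" (BEFORE), or only "from" (AFTER)."""
    if start is None and end is None:
        raise ValueError("give start, end, or both")
    clause = {"type": "range", "field": field}
    if start is not None:
        clause["from"] = start.strftime("%Y%m%d")  # indexed format is YYYYMMDD
    if end is not None:
        clause["to"] = end.strftime("%Y%m%d")
    return clause

def search_request(text, clauses, sort_field=None, sort_order="asc"):
    """Wrap clauses in a boolean FilterStatement and add optional sorting."""
    if sort_order not in ("asc", "desc"):
        raise ValueError("sort_order must be 'asc' or 'desc'")
    request = {
        "searchRequest": text,
        "FilterStatement": {"type": "boolean", "operator": "or", "clauses": list(clauses)},
    }
    if sort_field is not None:
        request["SortField"] = sort_field
        request["SortOrder"] = sort_order
    return request

between = date_range("datecreated", date(2019, 9, 19), date(2021, 9, 19))
before = date_range("datemodified", end=date(2019, 1, 10))
payload = search_request("test", [between, before], sort_field="language", sort_order="desc")
print(json.dumps(payload, indent=2))
```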

Sep 24, 2020 at 9:27 PM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Can we have a call to discuss the full text search module requirements and how we get data from the Contegra search?

We propose 6th October at 11:00 AM IST.

Please confirm.

Oct 01, 2020 at 7:58 AM Notified 12 people

Radomir Mladenovic

Hi Harsh,

Yes, we can talk 6th october 11:00 AM IST.

Oct 01, 2020 at 11:04 AM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

and

Rob

,

We are working on the Full Text Search module. Due to a technical challenge, we are delayed on this module.

We are ready with the SQL view and the requirements for the FTS module.

Could we have a call next Tuesday (20th October) at 11:00 AM IST to discuss and finalize the FTS module?

We will take the call on our Skype group.

Please confirm.

Oct 15, 2020 at 12:47 PM Notified 12 people

Radomir Mladenovic

Hi Harsh, Tuesday (20th October) 11:00 AM IST works for me.

Oct 15, 2020 at 6:27 PM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

As discussed today for the FTS module, I have added the details below.

1) We are providing you with a query and a view. The query is used to find the ContentTypeDataMasterId values based on the filter we pass. We then pass those ContentTypeDataMasterId values to our view to display the card details. The view is our final result.

2) The SQL View contains HtmlFileName, PDFFileName and ISPDFOnly Column.
ISPDFOnly : 0 -> Need to find search keyword in html files
ISPDFOnly : 1 -> Need to find search keyword in PDF files.

Server Path for PDF and HTML Files : (Server ip : 10.68.138.10)(E:\ISLGRebuildDemo\wwwroot\Documents)

3) We need to highlight the keyword in the paragraph or page text. We also need the hit count.

4) We have sorting and pagination on the display cards. By default we need to display 10 cards.

Here, I have attached sample PDF file, HTML file, Query and SQL View.

IN-0126-01 - Anglo-Adriatic v. Albania - Award - C - TD (1).pdf 3.71 MB • Download

IC-0183-02 - Dispute Document- Accession Danubius v. Hungary.html 175 KB • Download

Database Name : ISLGRebuild (Server : 10.68.138.11)
SQL Query Name : FE_MetafieldwithValueDynamicFTS
SQL View Name : VW_FTSDocumentSearch

Also, as discussed, please take the following SearchController.cs file, put it in your project, and use it for further work.

SearchController.cs 25.1 KB • Download

Oct 20, 2020 at 6:39 AM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

As discussed today, please let us know once you have resolved the index regeneration issue.

Also, please update us on the Full Text Search module. When can we expect it?

Oct 26, 2020 at 8:20 AM Notified 12 people

Radomir Mladenovic

Hi

Harsh

, I fixed the indexer to detect if index exists and update only. You can get it from OneDrive folder https://1drv.ms/u/s!AugzRBG6eTFwg4sJwygyvkWxOQnMWg?e=gfh2JQ

Oct 29, 2020 at 4:54 PM Notified 12 people

Radomir Mladenovic

Harsh

I hope to have Full Text Search early next week.

Oct 29, 2020 at 5:49 PM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

What should we do with the files you provided at the following link for the re-indexing issue?
https://1drv.ms/u/s!AugzRBG6eTFwg4sJwygyvkWxOQnMWg?e=gfh2JQ

We don't understand what to do with those files. How can we resolve this?

Nov 02, 2020 at 11:06 AM Notified 12 people

Radomir Mladenovic

Hi

Harsh

not sure what you mean. In the folder "2020-10-29 DB Indexer update" you have compiled indexer and modified sources.
The indexer checks if the index already exists and turns off the "Create" flag. That was the problem that prevented indexing when the index was in use. I reproduced the problem locally and it was fixed by this change.
Hope it works for you.
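
The create-or-update decision described above can be illustrated with a short Python sketch. This is not the indexer's code: how the real indexer detects an existing index is not stated here, so the "directory exists and is not empty" test, the function name index_mode, and the placeholder file are all assumptions.

```python
import tempfile
from pathlib import Path

def index_mode(index_dir):
    """Return "update" if the index directory already holds files, else "create".

    Assumption: a non-empty directory means an index exists. Creating from
    scratch over an index that is in use is what the fix above avoids.
    """
    index_dir = Path(index_dir)
    if index_dir.is_dir() and any(index_dir.iterdir()):
        return "update"
    return "create"

with tempfile.TemporaryDirectory() as tmp:
    empty_mode = index_mode(tmp)                       # nothing indexed yet
    (Path(tmp) / "placeholder.dat").write_text("x")   # stand-in for index files
    existing_mode = index_mode(tmp)

print(empty_mode, existing_mode)
```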

Nov 02, 2020 at 12:09 PM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

What do we need to do with those 3 files in the folder? Can we replace them in the TologixDBIndexer project?

If yes, then replacing those files gives an error that Dthelper does not exist.

Nov 02, 2020 at 12:11 PM Notified 12 people

Radomir Mladenovic

Hi

Harsh

, maybe the files were not copied properly. There's a DtHelpers class in SampleDataSource.cs, so I'm not sure why you would see the error.
In any case, you can use the already compiled exe that I provided.

Nov 02, 2020 at 7:22 PM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

As per your comment, we have also copied the SampleDataSource.cs file into the TologixDBIndexer project. The error is gone and the index is generated.

But the same problem still occurs. We need to stop the published application in IIS and then generate the indexes again; after that, results are returned. If we generate the index directly, no results are found.

Please do the needful. We are available on Skype until 6:00 PM IST to discuss.

Nov 03, 2020 at 7:21 AM Notified 12 people

Radomir Mladenovic

Hi

Harsh

, let's talk tomorrow 11:30 IST about the indexing issue. I reproduced the issue you had and the new indexer worked fine for me after the fix.

Nov 04, 2020 at 7:53 PM Notified 12 people

Radomir Mladenovic

Hi

Harsh

, I started indexing FE_MetafieldwithValueDynamicFTS and noticed one issue: ContentTypeDataMasterId values are not unique in this query. Which column identifies the document/row?

Nov 04, 2020 at 10:57 PM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We are available to discuss the FTS module and the re-indexing issue. Can we connect?

Nov 05, 2020 at 6:22 AM Notified 12 people

Radomir Mladenovic

Hi

Harsh

, as discussed today:

1. I need you to add a unique ID to the indexed query/view. It can be a row number as you suggested, but in that case you will not be able to do incremental indexing (as there's no reliable unique identifier of a document). As you said the full index is generated daily, this might not be a problem, so a row number is fine.

2. If we generate two separate indexes for the query and the view, we might hit a dtSearch limitation on the request size (I remember it was 64KB) when passing result IDs from the first index to the next one. That's why I highly recommend either:
a) Adding all data to one stored procedure as you suggested, or
b) Adding a ContentTypeDataMasterId parameter to the FE_MetafieldwithValueDynamicFTS procedure, so that we can index the view first and get only related data from the proc.
That will allow us to create a single index with all data.

3. dtSearch doesn't support text extraction (and highlighting) on page level. I'll think about how to meet your requirements for getting result pages and discuss this with

Rob

.
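
To give a feel for the request-size concern in point 2, here is a rough Python estimate of how large a request becomes when result IDs from one index are passed to a second search. The 64KB figure is the one recalled above and is not verified; joining the IDs with " or " is an assumption about how they would be passed, and both function names are hypothetical.

```python
LIMIT_BYTES = 64 * 1024  # figure recalled in the message above, not verified

def id_request_size(ids, separator=" or "):
    """Size in bytes of a request made by joining IDs with a separator."""
    return len(separator.join(str(i) for i in ids).encode("utf-8"))

def fits_in_one_request(ids):
    """True if the joined IDs stay within the assumed limit."""
    return id_request_size(ids) <= LIMIT_BYTES

small = list(range(10000, 11000))   # 1,000 five-digit IDs
large = list(range(10000, 20000))   # 10,000 five-digit IDs

print(id_request_size(small), fits_in_one_request(small))
print(id_request_size(large), fits_in_one_request(large))
```

Under these assumptions, a first search returning around ten thousand IDs would already exceed the limit, which is why a single index avoids the problem.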

Nov 05, 2020 at 8:01 AM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Below are the points discussed in today's call.

1) The re-indexing issue is resolved.
2) As you recommended, we have merged the query and the view into a single query for all data, so the second view is no longer needed. The query name is given below.
3) We have added a row number to the query as a unique identifier (column name RowId).

Stored procedure name: FE_MetafieldwithValueDynamicFTS. You can find it in the ISLGRebuild database on our server.

There is now no need for the second view. We get all data from the above-mentioned query.

For sorting, there are the following 4 options you can use from the query.

Relevance (default) - relevancy criteria based on current functionality of "hit count" - Depended on hit count from your side
Document Name (A-Z) - Column name (FullCitationText)
Newest First - Column name (SortingDate)(We have provided this column in yyyymmdd format)
Oldest First - Column name (SortingDate)(We have provided this column in yyyymmdd format)

As we discussed, when the ISPDFOnly column is 1 in the query, we need to display the page number from the PDF file, and when the user clicks on it we need to render that page under the paragraph and highlight the search word.

For HTML, we can display the paragraph from the HTML file and highlight the search keyword in that paragraph.

Also, we have paging in this module.

Hope this is fine and you have everything you need from our side.
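
The four sort options listed above could be mapped to the SortField and SortOrder request fields along these lines. This is a sketch only: the option labels and column names come from this message, while leaving the sort fields out for "Relevance" is an assumption about how the default ordering would be requested.

```python
# UI option -> (SortField, SortOrder); None means no explicit sort is sent.
SORT_OPTIONS = {
    "Relevance": None,                                # default, based on hit count
    "Document Name (A-Z)": ("FullCitationText", "asc"),
    "Newest First": ("SortingDate", "desc"),          # SortingDate is yyyymmdd
    "Oldest First": ("SortingDate", "asc"),
}

def sort_params(option="Relevance"):
    """Return the request fields for a UI sort option (empty for relevance)."""
    if option not in SORT_OPTIONS:
        raise ValueError(f"unknown sort option: {option}")
    choice = SORT_OPTIONS[option]
    if choice is None:
        return {}
    field, order = choice
    return {"SortField": field, "SortOrder": order}

print(sort_params("Newest First"))
print(sort_params())
```

Because SortingDate is stored as yyyymmdd, sorting it as text gives the same order as sorting by date.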

Nov 05, 2020 at 12:13 PM Notified 12 people

Radomir Mladenovic

Hi

Harsh

,

In the https://1drv.ms/u/s!AugzRBG6eTFwg4sJwygyvkWxOQnMWg?e=gfh2JQ
you can find a new folder with FTS indexer and the web service update.

Indexer config file indexer-config-fts.json specifies paths to two different indexes:
- IndexDir (as before) is the main documents index
- ParagraphIndexDir is for the index containing paragraph-level data.
Both indexes are created in one run.

Both indexes should be added to the web service appsettings.json as in:

    "Tologix": {
      "FullTextDocsIndex": "y:\\contegra\\tologix\\test-index\\tologix-index-fts\\",
      "FullTextParasIndex": "y:\\contegra\\tologix\\test-index\\tologix-index-fts-paras\\",
      "SubjectNavigatorIndex": "y:\\contegra\\tologix\\test-index\\subject-nav\\",
      "DisputesIndex": "y:\\contegra\\tologix\\test-index\\tologix-index-disputes\\",
      "DisputeDocsIndex": "y:\\contegra\\tologix\\test-index\\tologix-index-dispute-docs\\"
    },

The web service endpoint /api/search/fts was added for full text search requests. It's similar to what you had before, with additional paragraphs field containing a list of matching paragraphs.
See file full-text-search-example.txt for request and response payload example.

When paragraphs is null, it means that the matches are not in the paragraph contents but in some other field.

Currently, for documents with ispdfonly=True, the paragraph number is null. This is because we're still not indexing separate pages in PDF documents, so we cannot provide this information. We're still looking for a solution.

The web service endpoint /api/search/highlight-para is handling paragraph highlighting request. You need to pass an object with paraId (paragraph identifier from the search results), plus searchRequest and any applicable search control options. See BasicSearchParams class in the sources for all available options.
See file full-text-search-paragraph-highlighting-example.txt for request and response payload example.
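
As a rough illustration of the highlight request, the sketch below builds the URL and JSON body in Python without sending anything. The endpoint path and the paraId and searchRequest keys are taken from the description above; the base URL, the sample paraId value, and the helper name are hypothetical, and the example files remain the authoritative reference for the payload.

```python
import json
from urllib.parse import urljoin

BASE_URL = "http://localhost:5000/"  # hypothetical; use your web service address

def highlight_para_request(para_id, search_request, options=None):
    """Return (url, json_body) for a paragraph highlighting request.

    options: any extra search control options to include; none are named
    here because the available ones are defined in BasicSearchParams.
    """
    body = {"paraId": para_id, "searchRequest": search_request}
    body.update(options or {})
    return urljoin(BASE_URL, "api/search/highlight-para"), json.dumps(body)

# "para-123" is a made-up identifier; real ones come from the search results.
url, body = highlight_para_request("para-123", "interpretation")
print(url)
print(body)
```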

Let me know if you have any questions.

Nov 07, 2020 at 5:37 PM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We are integrating the Contegra FTS search into our application and are facing an issue getting data from the index. It throws an error, so we need to discuss this over a call.

We also need to understand the whole procedure for getting data, as well as the paragraph list, from the index.

We are stuck here and cannot continue development.

We are available until 6:30 PM IST on all weekdays. Please ping our Skype group to discuss this.

Nov 11, 2020 at 7:40 AM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Thanks for the quick call.

As discussed, we are currently getting an error when we fetch result data from the API call. Details are below.

at System.ThrowHelper.ThrowAddingDuplicateWithKeyArgumentException[T](T key)
at System.Collections.Generic.Dictionary`2.TryInsert(TKey key, TValue value, InsertionBehavior behavior)
at System.Collections.Generic.Dictionary`2.Add(TKey key, TValue value)
at TologixWebSearch.Controllers.SearchController.FullTextSearch(SearchModel sm) in D:\shrinivas\SVN_Project\TologicWebSearch\Controllers\SearchController.cs:line 65
at Microsoft.AspNetCore.Mvc.Internal.ActionMethodExecutor.SyncActionResultExecutor.Execute(IActionResultTypeMapper mapper, ObjectMethodExecutor executor, Object controller, Object[] arguments)
at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.<InvokeActionMethodAsync>d__12.MoveNext()
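
The trace ends in Dictionary.Add rejecting a key that is already present. The Python sketch below (not the service's C# code) shows one way to find which key values repeat in a set of rows before they are loaded into a map. The sample rows are made up, and which key the controller actually uses at line 65 is not stated here, so ContentTypeDataMasterId is only an illustrative choice.

```python
from collections import Counter

def duplicate_keys(rows, key):
    """Return the sorted key values that occur more than once in rows."""
    counts = Counter(row[key] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)

# Made-up rows for illustration only.
rows = [
    {"ContentTypeDataMasterId": 101, "Title": "A"},
    {"ContentTypeDataMasterId": 102, "Title": "B"},
    {"ContentTypeDataMasterId": 101, "Title": "C"},
]

repeated = duplicate_keys(rows, "ContentTypeDataMasterId")
print(repeated)
```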

Also, we need to highlight any search word in the Subject Navigator module, from any field.

Please let us know once you complete this.

Nov 11, 2020 at 9:22 AM Notified 12 people

Radomir Mladenovic

Hi

Harsh

, you can find fix for the FTS in the "2020-11-11 FTS update" folder on OneDrive.
I'll let you know when I have a solution for fields highlighting.

Nov 11, 2020 at 8:37 PM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Sad news: my team member Shrinivas's father passed away last week, so he will start working on this again this week.

We are getting an error while fetching paragraph text from the highlight-para method.

Can we have a call to look into this issue next Monday between 10:00 AM and 6:00 PM IST?

Nov 20, 2020 at 11:41 AM Notified 11 people

Radomir Mladenovic

Hi

Harsh

,
Sorry to hear that. Yes, we can talk on Monday. I'll contact you on Skype as soon as I'm available in the morning (around 13:00 IST, that should be my 08:30).

Nov 21, 2020 at 6:43 PM Notified 11 people

Radomir Mladenovic

Harsh

in the OneDrive folder you can find web service update for fields highlighting. The result object will contain map "highlightedFields" with highlighted content when a match was found.
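
A minimal Python sketch of how a client might use that map: show the highlighted content for a field when the map has it, otherwise fall back to the plain value. Only the "highlightedFields" name comes from the message above; the shape of the result object (plain field values at the top level, highlighted values keyed by field name), the sample data, and the markup in it are assumptions.

```python
def display_value(result, field):
    """Prefer the highlighted content for a field, else the plain value."""
    highlighted = result.get("highlightedFields") or {}  # map may be absent or null
    return highlighted.get(field, result.get(field))

# Made-up result object; field names and highlight markup are illustrative.
result = {
    "title": "Award on jurisdiction",
    "language": "English",
    "highlightedFields": {"title": "Award on <b>jurisdiction</b>"},
}

print(display_value(result, "title"))
print(display_value(result, "language"))
```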

Nov 21, 2020 at 9:05 PM Notified 11 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Thanks for today's call.

Does this API search for the word in PDF files and return the PDF page number, and if we click on the page number, will it display the whole page text with the search keyword highlighted?

Could you please confirm ?

Nov 23, 2020 at 12:20 PM Notified 11 people

Radomir Mladenovic

Hi

Harsh

, as discussed on Skype, the current FTS search returns matches for PDF documents but doesn't offer page hits. This is because dtSearch doesn't allow us to get PDF text for a particular page. We need to extend the custom indexer to support this, and as part of that we need a solution for splitting PDFs into pages.
My understanding is that Tologix has a PDF Highlighter license, but for an older version. I'm waiting on feedback from

Rob

and

Morgan

about this. If Tologix upgrades to the latest Highlighter Pro Edition, we can use it for text extraction on the page level.
If the upgrade doesn't happen, we'll need to research further and find some other solution for it.

Nov 23, 2020 at 4:03 PM Notified 11 people

Harsh Parikh, Tech Lead

OK Thanks

Radomir

for the confirmation. For now we are going ahead with HTML files; let us know once you have any confirmation or solution for PDF files.

Nov 23, 2020 at 4:06 PM Notified 11 people

Morgan Maguire, CEO

Hi

Harsh

,

Radomir

and

Rob

,

I spoke to

Rob

about this issue a few weeks ago, and told him to implement whatever solution that will provide us with the requirements we need for the PDF search results, including an upgraded PDF Highlighter license if required. So please implement your recommended solution as soon as possible, because this feature release is now several weeks past due.

Thanks,

Morgan

Nov 23, 2020 at 4:12 PM Notified 11 people

Radomir Mladenovic

Hi

Morgan

, sounds good. I'll start with PDF Highlighter integration and hope to have updated indexer in a day or two.

I have one technical question, and I'm not sure who can answer it: can I rely on the Highlighter server having access to the network share with the documents being indexed? I can upload the PDF to Highlighter whenever a file is indexed, but with Highlighter having access to the file share we can save time and bandwidth.

Thanks,
Radomir

Nov 23, 2020 at 6:19 PM Notified 11 people

Morgan Maguire, CEO

Hi

Radomir

,

I have no problem setting things up that way. As long as access to the network share is done securely and does not result in compromising the security of any data on our servers.

Harsh

or

Jitesh

, can you answer the question from a technical feasibility perspective, or do we need to talk to Carbon60 (our server hosting provider) about this?

Also,

Rob

, I assume we'll need to setup a new MSA to accommodate this setup?

Thanks,

Morgan

Nov 23, 2020 at 6:37 PM Notified 11 people

Harsh Parikh, Tech Lead

Hi

Morgan

and

Radomir

,

As Morgan suggested, if the network share is accessed securely, then I also don't see any problem setting it up that way.

Nov 24, 2020 at 11:56 AM Notified 11 people

Morgan Maguire, CEO

Ok. Great. Thanks

Harsh

.

Radomir

, please proceed, unless

Rob

, you think we should have a quick call to discuss?

Morgan

Nov 24, 2020 at 6:05 PM Notified 11 people

Rob Wiesenberg

Morgan

, the approach seems fine. Do you expect to continue to use the ISLG legacy product once the new version launches? The reason I ask is that we will need to determine if more than one license for PDF Highlighter is needed.

Radomir

can we serve both applications from the server where Highlighter will be running?

Nov 24, 2020 at 6:16 PM Notified 11 people

Morgan Maguire, CEO

Hi

Rob

,

Yes, we'll be maintaining the legacy application for a beta period (3-4 month). During that period, both applications will be operating in parallel to each other.

Thanks,

Morgan

Nov 24, 2020 at 6:23 PM Notified 11 people

Radomir Mladenovic

One Highlighter instance can serve both applications. We'll just need to upgrade the installation.
I can install trial version on the dev server for use with the current test content folders, and we can upgrade the production instance later.

Nov 24, 2020 at 9:49 PM Notified 11 people

Morgan Maguire, CEO

Ok. Sounds good

Radomir

. We'll install the new version of Highlighter on https://dev.investorstatelawguide.com/, confirm that it is working as required, and perform the install on https://www.investorstatelawguide.com/.

Harsh

, does that work for you?

Thanks,

Morgan

Nov 24, 2020 at 10:27 PM Notified 11 people

Radomir Mladenovic

Morgan

Harsh

I have PDF page indexer ready but need to install or update PDF Highlighter. It looks like the server 10.68.138.10 is running production Highlighter, correct?
I don't have access to other server (except the SQL Server) on which I could install Highlighter for development/testing. Do you want me to upgrade the production instance instead? There will be some downtime though (15-130 minutes until I migrate the config). Please let me know.

Nov 25, 2020 at 8:37 PM Notified 11 people

Radomir Mladenovic

I made a typo... 15-30min

Nov 25, 2020 at 8:37 PM Notified 11 people

Morgan Maguire, CEO

Hi

Radomir

,

Harsh

or

Jitesh

should probably confirm, but yes, the application run on server 10.68.138.10; however, this server is used for both the production (https://www.investorstatelawguide.com/) and development (https://dev.investorstatelawguide.com/) environments. Does that mean we're using the same instance of PDF highlighter for both environments?

Note if it's going to require downtime on the server that will affect the production environment, we should perform the install at between 2:00am and 3:00am Eastern Time (North America) on a Friday or Saturday evening to minimize disruption to users.

Thanks,

Morgan

Nov 25, 2020 at 9:02 PM Notified 11 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

All ISLG applications are running on server 10.68.138.10: the development environment (https://dev.investorstatelawguide.com/) and the production environment (https://www.investorstatelawguide.com/).

You can perform the Highlighter upgrade between 2 and 3 AM Eastern Time (North America) on Friday (that is, tomorrow, 27th November).

Please make sure it does not affect the production application (https://www.investorstatelawguide.com/).

Also, let us know once you have completed the upgrade and the server has been restarted, so we can check all running applications on server 10.68.138.10.

Nov 26, 2020 at 10:08 AM Notified 11 people

Morgan Maguire, CEO

Hi

Radomir

,

I discussed the issue above with

Harsh

earlier today. Assuming that you agree that the risk to the production environment FTS is low (i.e., that making the updates to the PDF Hit Highlighter software will not affect the FTS in the subscriber side of https://www.investorstatelawguide.com/), please proceed with making the updates on server 10.68.138.10 during the window on Friday or Saturday this weekend.

Please confirm when the updates will occur, and

Harsh

and

Ketan

will ensure that someone will be available to test the application after the updates are made to ensure there are no issues.

Thanks,

Morgan

Nov 26, 2020 at 7:26 PM Notified 10 people

Radomir Mladenovic

Hi

Morgan

,

Harsh

,

I can do the upgrade during my Saturday morning (which should be around 2am your time). I'll send a message to

Harsh

and

Ketan

so they make sure the production is working as expected.

FYI, I'll be on the road from Saturday afternoon to Sunday evening, so I will not be available during that time. If you think it's too risky to upgrade before my trip, we can leave it for my Monday morning (which should be around 2am in the US and still gives Harsh and the team time to verify the installation, and we have enough time for any corrections before US work hours).

Please let me know what you prefer.

Thanks,
Radomir

Nov 27, 2020 at 6:25 PM Notified 10 people

Morgan Maguire, CEO

Ok. Sounds good,

Radomir

. Let's proceed with the upgrade on Saturday at 2am. I will send out a calendar invite to remind everyone.

Harsh

and

Ketan

, please take note that someone will need to be available to check the application (including the FTS on https://www.investorstatelawguide.com/) is functioning as required on Saturday between 12:30pm and 1:30pm Ahmedabad time.
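
The window conversion can be checked with a few lines of Python. This sketch assumes the upgrade date is Saturday, 28 November 2020 (inferred from the message timestamps) and uses fixed UTC offsets: Eastern Standard Time (UTC-5, which applies in late November) and India Standard Time (UTC+5:30).

```python
from datetime import datetime, timedelta, timezone

EST = timezone(timedelta(hours=-5), "EST")            # Eastern Standard Time
IST = timezone(timedelta(hours=5, minutes=30), "IST")  # India Standard Time

# Assumed upgrade start: Saturday 28 Nov 2020, 2:00am Eastern.
upgrade_start = datetime(2020, 11, 28, 2, 0, tzinfo=EST)
# Start of the check window in Ahmedabad (IST) for the same instant.
check_start = upgrade_start.astimezone(IST)
check_end = check_start + timedelta(hours=1)

print(check_start.strftime("%A %H:%M %Z"), "to", check_end.strftime("%H:%M %Z"))
```

The 10.5-hour difference between the two zones is what places a 2:00am Eastern start at 12:30pm in Ahmedabad.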

Thanks,

Morgan

Nov 27, 2020 at 6:32 PM Notified 10 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Please update us once you complete the process on the server so we can check all applications.

Nov 28, 2020 at 4:09 AM Notified 10 people

Radomir Mladenovic

Hi

Harsh

, PDF Highlighter has been upgraded. Please test and let me know if you notice any issue.

Nov 28, 2020 at 7:12 AM Notified 10 people

Harsh Parikh, Tech Lead

Ok

Radomir

. I will check and provide the feedback.

Nov 28, 2020 at 7:14 AM Notified 10 people

Harsh Parikh, Tech Lead

Hi

Radomir

and

Morgan

,

I have checked all applications as well as the FTS module on both investorstatelawguide.com and dev.investorstatelawguide.com, and all are working fine. The highlighter is also working in the FTS module.

Nov 28, 2020 at 7:25 AM Notified 10 people

Radomir Mladenovic

Hi

Harsh

, I'm trying to test new FTS indexer but looks like there's no content currently in test database where IsPdfOnly=True. Could you please add some sample PDF content?

Nov 28, 2020 at 7:32 AM Notified 10 people

Morgan Maguire, CEO

Great. Thank you

Harsh

and

Radomir

. I've tested the FTS as well, and everything appears to be working normally.

Yes,

Radomir

. There are no documents on rebuild.islg; however, there should be some sample content on rebuilddemo.islg.

Thanks,

Morgan

Nov 28, 2020 at 3:05 PM Notified 10 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We have uploaded test documents for ISPDFOnly=True on our database ISLGRebuild on server 10.68.138.11.

Also, the PDF files are stored on the E drive of server 10.68.138.10 at the following path.

E: Drive : ISLGRebuildDemo\wwwroot\Documents\PDFFiles

Nov 30, 2020 at 4:59 AM Notified 10 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Currently, 2 things are pending in the FTS module from your side.

1) When we click on a paragraph from an HTML file, we display the HTML paragraph text below the paragraph, but we also need to highlight the search keyword inside that paragraph, which is not currently working.

2) We need to display the page number when ISPDFOnly = True, and when the user clicks on the page number we need to display the whole page text and highlight the search keyword.

Can we have a call tomorrow on Skype between 10:00 AM and 6:00 PM IST?

Please confirm

Nov 30, 2020 at 10:47 AM Notified 10 people

Radomir Mladenovic

Hi

Harsh

,

Paragraph extraction was implemented for the HTML file format you sent me, and highlighting for that worked fine on my end. This HTML had:

<div> elements with class "paraBullet" containing paragraph number, and
elements with class "ParaText" containing paragraph text.

However, you've added HTML files in a different format. Please check file "OTI-0004 - Vienna Convention on the Law Treaties-1.html" for example. The above mentioned classes are used in a different way and paragraphs cannot be extracted using the same logic.

The indexer was breaking because it could not parse HTML files in an unsupported format. I believe that's the reason why paragraph highlighting didn't work for you. I made a change to the indexer so that an invalid file is skipped and the error is logged. Now, at least valid HTML files will be handled.
I also managed to test PDF page extraction and highlighting so that's working as well.

In order to extract paragraphs from HTML we need consistency in the HTML formatting. Please, let me know which format you're going to use. If there are multiple different formats, we need to support them all. It would be great if you could provide us with all format details. If you don't have this information, I'm afraid we'll have to do it one by one, analyzing error logs and the actual content.
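
To make the dependency on formatting concrete, here is a small Python sketch of extracting paragraphs from the supported layout using the standard html.parser module. It is not the indexer's implementation: the class and variable names are hypothetical, the sample HTML is made up, and it assumes each "paraBullet" element is followed by its "ParaText" element.

```python
from html.parser import HTMLParser

VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}  # tags with no closing tag

class ParagraphExtractor(HTMLParser):
    """Collect (number, text) pairs from paraBullet / ParaText elements."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._target = None   # "number" or "text" while inside a tracked element
        self._depth = 0       # nesting depth inside the tracked element
        self._buffer = []
        self._number = None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._target:
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if "paraBullet" in classes:
            self._target = "number"
        elif "ParaText" in classes:
            self._target = "text"
        if self._target:
            self._depth, self._buffer = 1, []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._target:
            return
        self._depth -= 1
        if self._depth == 0:
            content = " ".join("".join(self._buffer).split())  # normalize whitespace
            if self._target == "number":
                self._number = content
            else:
                self.paragraphs.append((self._number, content))
                self._number = None
            self._target = None

    def handle_data(self, data):
        if self._target:
            self._buffer.append(data)

sample = """
<div class="paraBullet">1.</div>
<div class="ParaText">The Tribunal <b>finds</b> that the claim is admissible.</div>
<div class="paraBullet">2.</div>
<div class="ParaText">Costs are reserved.</div>
"""
extractor = ParagraphExtractor()
extractor.feed(sample)
print(extractor.paragraphs)
```

A file that uses the same class names in a different arrangement would produce wrong or empty pairs with this logic, which is why each format needs to be known in advance.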

In the OneDrive folder you can find the updated indexer and FTS indexing configuration. Note the new config parameter PdfHighlighterUrl, which is required for PDF page extraction.

Nov 30, 2020 at 11:48 PM Notified 10 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We are looking into the above. Meanwhile, we should have a call tomorrow between 1 PM and 6 PM IST so that we stay on the same page.

We also need to discuss the HTML format.

Please confirm for tomorrow's call.

Dec 01, 2020 at 1:42 PM Notified 10 people

Radomir Mladenovic

Hi

Harsh

, let's talk tomorrow (Wednesday) at 1:30PM IST.

Dec 01, 2020 at 10:39 PM Notified 10 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

After today's call, we started looking into all the functionality we developed for FTS.

1) Data Binding
2) Paragraph Listing
3) Sorting
4) Paging of Document Cards

However, the paging of document cards is not working. When we click on the 2nd page, the same data as the 1st page is rendered.

Have we missed something for paging? Can we have a call again tomorrow at 1:30 PM IST to resolve this issue?

Please confirm.

Dec 02, 2020 at 2:19 PM Notified 10 people

Radomir Mladenovic

Hi

Harsh

, I'm sending you a fix for the pagination. Note that pagination was enabled now for Disputes and Subject Navigation as well. If this is not desired for these collections, you can change the last parameter in call to CreateResponse to include all results. In that case, let me know to update my copy as well.

SearchController.cs 31.3 KB • Download
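
For reference, page-based slicing of an ordered result list generally looks like the Python sketch below. It is not the code in SearchController.cs: the function name is hypothetical, and the default page size of 10 is taken from the earlier requirement of 10 cards per page.

```python
def page_slice(items, page, page_size=10):
    """Return the items for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be at least 1")
    start = (page - 1) * page_size      # offset grows with the page number
    return items[start:start + page_size]

results = list(range(1, 26))            # 25 made-up result IDs

first_page = page_slice(results, 1)
second_page = page_slice(results, 2)
last_page = page_slice(results, 3)
print(first_page, second_page, last_page, sep="\n")
```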

Dec 03, 2020 at 5:33 AM Notified 10 people

Harsh Parikh, Tech Lead

OK

Radomir

Thanks. Will implement this thing and will let you know if we find any issue or error.

Dec 03, 2020 at 5:37 AM Notified 10 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

How can we set the Rebuild project document path in the Highlighter folder? Currently, application.conf already has the live path set.

How can we also set the Rebuild project document path on server 10.68.138.10, and which Highlighter URL should we use?

Please answer as early as possible, as we have a release for UAT tomorrow.

Dec 12, 2020 at 8:39 AM Notified 10 people

Radomir Mladenovic

Hi

Harsh

,

First, do you want to use the same Highlighter instance or not?
What you could do is install another Highlighter instance on a different path and a different port.
If using the same instance, you could use a different path prefix for the Rebuild Project and in folder mapping settings point Highlighter to a different folder for that prefix.

If you're using the same instance, you need to use the same Highlighter URL. Otherwise, you need to setup another proxy on IIS to point to another instance's port.

Dec 12, 2020 at 6:23 PM Notified 10 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Can we have a short call tomorrow or Monday between 10 AM and 6 PM IST? We are not very familiar with the Highlighter setup on the server, and we don't want to take any risk with the current live ISLG application.

Please confirm.

Dec 12, 2020 at 6:27 PM Notified 10 people

Radomir Mladenovic

Hi

Harsh

, let's talk tomorrow 1:30 PM IST.

Dec 12, 2020 at 7:15 PM Notified 10 people

Harsh Parikh, Tech Lead

Ok Thanks

Radomir

. We will take the call tomorrow (13th December) at 1:30 PM IST on Skype.

Dec 12, 2020 at 7:20 PM Notified 10 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Hope you are doing well.

We have started looking into the FTS module bugs, and for 2 of the bugs we ran the following scenarios.

Scenario 1 :

Enter only keyword test
Click on Search button
The API provides the result with data where language display English

Scenario 2 :

Enter only keyword test
Apply Filter language English
Click on Search button
The API doesn't provide any data

We are available from 10:00 AM to 6:00 PM IST from next week.

Please confirm.

Jan 01, 2021 at 10:25 AM Notified 10 people

Radomir Mladenovic

Hi

Harsh

, the indexer logs debug details with all data fields indexed - the debug was enabled by default so you should have this in the log. Please, make sure the value that was indexed for the record is the one you use in the filter.
If you cannot see the error, please send me the log, the index and the search payload you're using, and I'll try to reproduce it.

Jan 04, 2021 at 10:01 AM Notified 10 people

Ketan Sondarva, Technical Project Manager

Hi

Radomir

,

I would suggest a short call so we can show you the errors raised by the Industrial team; after that, you can continue the communication via this channel.
Let me know your availability for tomorrow, as Harsh has already sent you a message asking to meet to go through the issues and get a solution as soon as possible.

Thanks,
Ketan Sondarva

Jan 04, 2021 at 10:04 AM Notified 10 people

Radomir Mladenovic

Hi

Ketan

, ok let's have a call 5PM IST today

Jan 04, 2021 at 10:11 AM Notified 10 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We have another call at 6:00 PM IST. Can we have the call at 4:00 PM IST today? That means in 15 minutes.

Or we can take call by tomorrow between 11:00 AM to 4:00 PM.

Please confirm.

Jan 04, 2021 at 10:15 AM Notified 10 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

As discussed in today's call, below are the database name and the documents path on the server so you can test.

Server IP : 10.68.138.11
Database Name : ISLGRebuildFTS

Server IP : 10.68.138.10
Documents Path : E:\ISLGRebuildDemo\wwwroot\DocumentsFTS

At the above path you will find all the HTML and PDF documents on the server.

I will also post all the bugs that the Industrial team raised.

Jan 05, 2021 at 8:29 AM Notified 10 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Below are the descriptions of the bugs that the Industrial team raised.

1 ) issue no : 20579 - Search results with no pinpoint references

Steps to reproduce:

Go to full text search
type 'interpretation' in the search input field
submit your search

Result: I am taken to a search results screen with 5 document results. Only 1 document result contains a pinpoint reference.

Expected: For each document result there must be at least one pinpoint reference for the keyword appearance in the text.

2) issue no : 20572 - FTS > Cannot perform search with filters

Steps to reproduce:

Go to Full Text Search
Make a selection in the 'Language' field (ie: French)
Scroll down and select [Submit search]

Result: I am taken to the search results page. No results or message is presented.

Expected: I expect to see the documents with my selected language as a result for my search. The documents are displayed if I do not apply a filter.

3) issue no : 20586 - Search with basic filter does not work

Steps to reproduce:

go to full text search
Type the term 'test' in the search input
Select 'English' from the language field
submit the search

Result: I am taken to the search results page. Some of the search results are not documents in 'English'

Expected: based on my applied filters I must see only documents containing the word test + with language = English

4) issue no. 20611 - Search > Not able to search without inputting a keyword

Steps to reproduce:

Go to FTS
Add any filtering parameters but not a keyword and press "search"

Result: Leads to an empty page with no results

Expected: Will see any results that correspond to the filtering parameters

5) issue no. 20612 : Search > Boolean search not working

Steps to reproduce:

Go to FTS
Set search type to "boolean"
Enter yellow OR blue in the keyword search and press search

Result: No results found

Expected: Will see results that correspond to the "blue" keyword

6) issue no. 20613 : Search > Fuzzy Typo not working

Steps to reproduce:

Go to FTS
Enter term "blun" into keyword search
Set fuzzy typo to include 1 letter
enter search

Result: No results found

Expected: Should see results that correspond as if I entered "blue"

Please let us know if you need anything else.

Jan 05, 2021 at 8:35 AM Notified 10 people

Radomir Mladenovic

Harsh

, I made a change to the SearchController to accept searches without keywords.
The update is in the OneDrive folder "2021-01-06 FTS update"

2) issue no : 20572 - FTS > Cannot perform search with filters

Should be fixed with the update. For the index you sent me, the following test search returns 1 result:

{
"searchRequest": "",
"FilterStatement": {
"type": "boolean",
"operator": "or",
"clauses": [
{
"type": "match",
"exclude": false,
"field": "language",
"value": "French"
}
]
},
"SortField": "sortingdate",
"SortOrder": "desc",
"PageSize": 50,
"PageNum":0
}

3) issue no : 20586 - Search with basic filter does not work

I don't see any issue here - I'm getting only results with English.

My search payload:

{
"searchRequest": "test",
"FilterStatement": {
"type": "boolean",
"operator": "or",
"clauses": [
{
"type": "match",
"exclude": false,
"field": "language",
"value": "English"
}
]
},
"SortField": "sortingdate",
"SortOrder": "desc",
"PageSize": 50,
"PageNum":0
}

or you can send it as:

{
"searchRequest": "test",
"FilterStatement": {
"type": "match",
"field": "language",
"value": "English"
},
"SortField": "sortingdate",
"SortOrder": "desc",
"PageSize": 50,
"PageNum":0
}

In both cases I get 4 results, all English.

I suspect there's some issue with the structure of your request. Please send me the payload you're submitting to the service.
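For comparison with the two payloads above, here is a minimal sketch of how a client could normalise the short filter form into the boolean form before posting. The field names are copied from the payloads in this thread; the helper names `as_boolean_filter` and `build_search_payload` are hypothetical and not part of the web service.

```python
import json


def as_boolean_filter(filter_statement):
    """Wrap a bare "match" clause in the boolean form; leave boolean filters untouched."""
    if filter_statement.get("type") == "boolean":
        return filter_statement
    clause = dict(filter_statement)
    clause.setdefault("exclude", False)
    return {"type": "boolean", "operator": "or", "clauses": [clause]}


def build_search_payload(keyword, filter_statement, page_size=50, page_num=0):
    """Assemble a request body shaped like the test payloads in this thread."""
    return {
        "searchRequest": keyword,
        "FilterStatement": as_boolean_filter(filter_statement),
        "SortField": "sortingdate",
        "SortOrder": "desc",
        "PageSize": page_size,
        "PageNum": page_num,
    }


short_form = {"type": "match", "field": "language", "value": "English"}
payload = build_search_payload("test", short_form)
print(json.dumps(payload, indent=2))
```

Printing the payload before sending it makes it easy to diff against the examples above when a filter returns unexpected results.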

4) issue no. 20611 - Search > Not able to search without inputting a keyword
This looks like a duplicate of #2, should be working now.

5) issue no. 20612 : Search > Boolean search not working

Looks good to me. The following search returns 5 results for me, highlighting both "blue" and "yellow":

{
"searchRequest": "yellow OR blue",
"searchType": "Boolean",
"SortField": "sortingdate",
"SortOrder": "desc",
"PageSize": 50,
"PageNum":0
}

Please send your request payload.

6) issue no. 20613 : Search > Fuzzy Typo not working

Looks good to me. The following search returns 3 results, highlighting "Blue":

{
"searchRequest": "blun",
"Fuzzy": true,
"SortField": "sortingdate",
"SortOrder": "desc",
"PageSize": 50,
"PageNum":0
}

Please send me your payload.

I ran all tests with the index you sent me.

I'm still investigating #1 and will get back to you on it later.

Jan 06, 2021 at 12:07 AM Notified 10 people

Radomir Mladenovic

1 ) issue no : 20579 - Search results with no pinpoint references

As discussed before, we need a better description of the HTML format from you. This issue is related to that.

The test file you sent us initially was simple:

the <div> element with the paragraph number had the "paraBullet" class, and
the following <div> element with the paragraph text had the "ParaText" class.

When you search for "interpretation", one of the documents without paragraph highlights is "OTI-0082 - ILC Draft Articles on Diplomatic Protection (2006).html". Please check this document and provide details on how to extract paragraphs from it. To me, this document looks messy even visually - deep nesting, repeating paragraph numbers, etc.

To successfully extract paragraph content, we really need a description of the format(s) you're using. Pick not the simplest but the most complex examples of content to be handled, and explain how to get data from them - what makes a paragraph.
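To illustrate the simple case described above (a "paraBullet" div followed by a "ParaText" div), here is a minimal sketch using Python's standard `html.parser`. It is not the indexer's actual parser; the sample HTML and the class name `ParagraphExtractor` are invented for illustration, and the sketch deliberately ignores the deeper nesting seen in the real documents.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Pair each "paraBullet" div (number) with the "ParaText" div that follows it."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []   # collected (number, text) pairs
        self._roles = []       # role of every currently open <div>
        self._number = []      # text pieces of the pending paragraph number
        self._text = []        # text pieces of the current paragraph body

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        classes = (dict(attrs).get("class") or "").split()
        if "paraBullet" in classes:
            role = "number"
            self._number = []
        elif "ParaText" in classes:
            role = "text"
            self._text = []
        else:
            role = None
        self._roles.append(role)

    def handle_endtag(self, tag):
        if tag != "div" or not self._roles:
            return
        if self._roles.pop() == "text":
            number = "".join(self._number).strip()
            text = " ".join("".join(self._text).split())
            self.paragraphs.append((number, text))
            self._number = []

    def handle_data(self, data):
        # Attribute text to the innermost enclosing number/text div, if any.
        for role in reversed(self._roles):
            if role == "number":
                self._number.append(data)
                return
            if role == "text":
                self._text.append(data)
                return


# Invented sample markup, not taken from the real documents.
sample = """
<div class="paraBullet">(1)</div>
<div class="ParaText">First paragraph text.</div>
<div class="paraBullet">(2)</div>
<div class="ParaText">Second paragraph with <b>bold</b> text.</div>
"""

extractor = ParagraphExtractor()
extractor.feed(sample)
print(extractor.paragraphs)
```

A flat pairing like this breaks down as soon as numbers repeat or nesting varies, which is why a description of the most complex cases is needed.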

Thanks,
Radomir

Jan 06, 2021 at 1:25 AM Notified 10 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

I have attached our HTML coding manual, which defines the HTML structure.

This document covers all the scenarios, so please refer to it for the HTML description.

ISLG HTML Coding Manual_15072020.docx 1.45 MB • Download

Jan 06, 2021 at 10:06 AM Notified 11 people

Radomir Mladenovic

Hi

Harsh

,

Thanks for the document. From what I can tell, it focuses mainly on the visual aspects, with not much information about extracting paragraph data.

1. Check this screenshot from the above-mentioned sample document. Two paragraphs are marked with (3), on the same level. My assumption was that paragraph numbers are unique. How should this be represented in the index?

image.png 160 KB • Download

2. Text from page footers, marked with class "pdffootnote", will be indexed as part of the HTML. However, as far as I can tell, it doesn't belong to any particular paragraph, so there will be no paragraph in search results for matches in this content. Makes sense?

3. If you have information on which text belongs to which paragraph, can you add additional data attributes to the content (e.g. similar to the "data-key" attributes that exist in the HTML) to make text extraction simple? For example, adding a "data-para" attribute to a text div, where the value is the paragraph number, would make the extraction way simpler.

Thanks,
Radomir

Jan 08, 2021 at 8:30 AM Notified 11 people

Harsh Parikh, Tech Lead

Hi

Piyush

and

Jitesh

,

Could you please answer

Radomir

's questions above?

Jan 08, 2021 at 8:32 AM Notified 11 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Regarding your 2nd question, you are right: the second paragraph no. 3 in the screenshot is a footnote, and we need only paragraphs in the FTS search results. We don't need footnotes. However, I need confirmation from Morgan.

Morgan

Please confirm.

For 3rd Point

Radomir

, we can't change the HTML structure now, as we have already converted all the PDF documents into HTML, both by algorithm and manually, so it is not possible to add any attributes to the HTML files.

Jan 08, 2021 at 8:38 AM Notified 11 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Actually, the HTML you are looking at was generated by the old algorithm. I have attached the same HTML file with uniquely generated IDs, so you can get an idea from it.

OTI-0082 - ILC Draft Articles on Diplomatic Protection (2006).html 306 KB • Download

Jan 08, 2021 at 9:17 AM Notified 11 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Could you please send the OneDrive link again, as we are unable to open it? Please also provide the numbers of the bugs you resolved.

Please also tell us what we need to do. Can we replace the search controller with the one from OneDrive?

We are using the following link, but it does not open:

https://1drv.ms/u/s!AugzRBG6eTFwg4sJwygyvkWxOQnMWg?e=gfh2JQ

Is it possible to send only the SearchController file so we can replace it?

Jan 08, 2021 at 9:22 AM Notified 11 people

Radomir Mladenovic

Harsh

The OneDrive link is https://1drv.ms/u/s!AugzRBG6eTFwg4sJwygyvkWxOQnMWg?e=2MNKXq

As for issue numbers, the only issue I see is with HTML parsing. Please check my previous comments, where I listed your bug numbers and the search payloads I used to test.

Jan 08, 2021 at 10:43 AM Notified 11 people

Morgan Maguire, CEO

Hi

Harsh

and

Radomir

,

Responding to

Harsh

's response above: Re: Search Implementations - TOLOGIX - Contegra Search Audit, please note that all text in the documents needs to be indexed within the FTS, including footnotes, headings, TOC, etc. We need to ensure a result is displayed no matter the context of the text, similar to what it currently does for PDFs in the legacy application.

Thanks,

Morgan

Jan 08, 2021 at 5:27 PM Notified 11 people

Radomir Mladenovic

Harsh

I'm trying to make sense of the document "OTI-0082 - ILC Draft Articles on Diplomatic Protection (2006).html" that you sent me on Jan 8.
I'm sending you paragraphs extracted from this document. The first column is the paragraph number, the second column is the paragraph text. Please review and correct what's needed.

test-new-extracted.html 153 KB • Download

In this document I didn't include footnotes. The plan is to add each footnote to the paragraph that references it. However, we need to make a proper paragraph extraction first.

My assumption was that the paragraphs would have unique numbers within the document. However, that doesn't appear to be the case. Check, for example, the paragraph with number (1) - it appears in multiple parts of the document. That means in search results we should have multiple result paragraphs (1) for the same document, correct?

Jan 12, 2021 at 11:44 PM Notified 11 people

Radomir Mladenovic

Harsh

Morgan

Any update/feedback on my question above about the paragraph extraction?

Jan 18, 2021 at 1:27 PM Notified 11 people

Morgan Maguire, CEO

Hi

Radomir

, I'll let

Harsh

respond to this one. Note that

Harsh

was on leave Wednesday, Thursday and Friday last week.

Morgan

Jan 18, 2021 at 9:07 PM Notified 11 people

Harsh Parikh, Tech Lead

Jitesh

and

Piyush

, Could you please provide your feedback on

Radomir

's question regarding html structure.

Jan 19, 2021 at 10:51 AM Notified 11 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

I have one question for you. When indexing runs every day, will the ParaId parameter change for the same document?

If yes, it will create an issue for us, because the application has bookmark functionality that lets users bookmark any pinpoint paragraph from a search.

We are using the ParaId to get the HTML or PDF text.

For example, a user searches for a keyword and bookmarks paragraph 8 in the application.

When the bookmark for paragraph 8 is saved, the HTML text is displayed through the API. But if we check the next day, after the index has been regenerated, the ParaId we saved has changed.

We need the ParaId to stay the same at all times to fetch the HTML or PDF text.

Please provide your feedback. Also, we are available to discuss on Skype between 11:00 AM and 6:00 PM IST.

Jan 19, 2021 at 11:00 AM Notified 11 people

Radomir Mladenovic

Harsh

correct, paraId might change with an index update. It's not a good candidate for use in bookmarks. I'd suggest using the real paragraph number, but that also makes sense only if paragraph numbers are unique in the document.
Depending on feedback I get on paragraph extraction questions, we'll probably need to change indexing and how paragraphs are referenced in the index. I'd wait on these answers before making further changes.

Jan 20, 2021 at 10:57 AM Notified 11 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

I talked with the HTML team. In the HTML provided above, the two paragraphs numbered 1 have different IDs:

1st paragraph ID: pa1
2nd paragraph ID: pa1.1

So if a user searches for a keyword and it matches in both paragraphs, we should display both paragraph numbers in the application.

Morgan

, I hope I am correct.

Jan 20, 2021 at 11:01 AM Notified 11 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

The ParaId should not change when the indexes are regenerated, because we use the ParaId in the bookmark functionality to retrieve the HTML text.

Jan 20, 2021 at 11:03 AM Notified 11 people

Morgan Maguire, CEO

Hi

Harsh

and

Radomir

,

Following up on the above, yes, it will be common in certain documents for the same paragraph number to be used across the same document, particularly within Treaties and Arbitration Rules. This is why paragraph IDs need to be used as the unique identifiers for paragraphs and footnotes across a document.

For example, within ARB/0029 - ICSID Arbitration Rules (2006) you see that each rule contains subparagraphs that are numbered according to conventional bulleting where the numbering restarts under each rule. As a result, to ensure each subparagraph can be uniquely identified, we have inserted references to relevant rule or section into each paragraph ID:

image.png 123 KB • Download

Therefore, as

Harsh

has already pointed out, paragraph IDs need to be used for indexing purposes.

Radomir

, will using paragraph IDs cause a problem for the indexing? Why aren't paragraph IDs good candidates for bookmarks?

Thanks,

Morgan

Jan 20, 2021 at 12:43 PM Notified 11 people

Radomir Mladenovic

Thank you

Morgan

, that's a useful explanation.

The paraId that I said is not suitable for bookmarking is not your paragraph ID. It's the dtSearch document number, which we used for fast retrieval of found paragraphs. For bookmarking, you should use your paragraph ID, and I'll look into changing the web service to support data retrieval using this ID.
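A small sketch of why an index-assigned number is a fragile bookmark key while the document's own paragraph ID is not. The numbering scheme here is a simplification and only an assumption about how such numbers can shift between indexing runs; the IDs "pa1" and "pa1.1" come from this thread, while "pa2", "fn1" and the function name are hypothetical.

```python
def assign_index_numbers(paragraph_ids):
    """Mimic an indexer that numbers paragraphs in the order it processes them."""
    return {number: pid for number, pid in enumerate(paragraph_ids, start=1)}


# First indexing run, then a second run after one extra entry was added earlier in the order.
run_1 = assign_index_numbers(["pa1", "pa1.1", "pa2"])
run_2 = assign_index_numbers(["fn1", "pa1", "pa1.1", "pa2"])

bookmark_by_number = 2        # saved after run 1, when number 2 meant "pa1.1"
bookmark_by_id = "pa1.1"      # the paragraph ID taken from the HTML itself

print(run_1[bookmark_by_number])        # pa1.1
print(run_2[bookmark_by_number])        # pa1 (now a different paragraph)
print(bookmark_by_id in run_2.values()) # True (the ID still resolves)
```

Storing the paragraph ID together with the document reference keeps a saved bookmark valid across full re-indexing.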

How should footnotes be indexed and referenced? Should they be indexed with a paragraph referencing them, or separately?

Jan 20, 2021 at 1:02 PM Notified 11 people

Morgan Maguire, CEO

Sounds good,

Radomir

.

The footnotes should be indexed separately without the paragraph that references them. In other words, we want the user to be directed to the text of the footnote itself if that produces a hit for the searched keyword. Does that make sense?

Morgan

Jan 20, 2021 at 1:16 PM Notified 11 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

In bookmarks, we use the same ParaId to retrieve text from HTML and PDF as we use in the FTS module.

So my assumption is that you will change this in your web service and it will work for both FTS and bookmarks, with no need to change anything in the application.

Please note that for bookmarks we save the URL, which contains the ParaId parameter, in the database.

Jan 21, 2021 at 8:12 AM Notified 11 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

The Industrial team has raised 2 bugs. Please look into them.

To check both issues, you can use the ISLGRebuild database on the server. All PDF and HTML documents can be found at the following path: E:\ISLGRebuildDemo\wwwroot\Documents

21136 : Search > All words > no pinpoint references

Steps to reproduce:

go to FTS
enter the terms 'greek and september or tribunal' into the search field
select 'All words' as the search type
submit the search

Result: Search results cards do not include pinpoint references

Expected: A paragraph link must exist for each paragraph in the document where at least one of the keywords from the search was found

image.png 323 KB • Download

21138 : Search > Keyword not highlighted

Steps to reproduce:

Go to FTS
enter the term 'like' into keyword input
submit search
from search results, find a result with pinpoint references
Select a pinpoint reference to preview excerpt

Result: No keyword is highlighted in the result. The keyword does not seem to appear in the paragraph.

Expected: Only paragraphs with keyword appearances will be available as pinpoint references to preview excerpts. The keyword must be highlighted in the excerpt preview.

image.png 317 KB • Download

Jan 22, 2021 at 10:26 AM Notified 11 people

Radomir Mladenovic

Hi

Harsh

,

I made changes to the HTML parser. To see how extraction worked for your sample file, check:

test-new-extracted.html 180 KB • Download

On the OneDrive (https://1drv.ms/u/s!AugzRBG6eTFwg4sJwygyvkWxOQnMWg?e=2MNKXq) under the folder "2021-01-23" you can find updates for the indexer and web service.

Before indexing with the new indexer, delete FTS index folders completely - new index is needed as there have been some changes in the structure.

In the results, paraId is now a string, not a number, and it looks like this:

image.png 15.6 KB • Download

In the paragraph highlighting request you also need to send this id:

image.png 7.16 KB • Download

Let me know if you have any question.

As for the bugs you sent:

21136 "A paragraph link must exist for each paragraph in the document where at least one of the keywords from the search was found"

This is a new requirement for me. We'll need to parse complex queries that contain multiple terms and boolean expressions to extract keywords only and then find paragraphs. I'll let you know when this is ready.
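A rough sketch of the keyword extraction step described above, assuming only the plain operators "and", "or" and "not", parentheses, and double-quoted phrases. Real dtSearch query syntax has more operators than this, so the function (a hypothetical `extract_keywords`) is an illustration rather than the service's implementation.

```python
import re

# Assumed operator set; the real query language supports more than these.
OPERATORS = {"and", "or", "not"}


def extract_keywords(query):
    """Return the search terms of a query, dropping operators and parentheses."""
    tokens = re.findall(r'"[^"]+"|[^\s()]+', query)
    keywords = []
    seen = set()
    for token in tokens:
        term = token.strip('"')
        key = term.lower()
        if key in OPERATORS or key in seen:
            continue
        seen.add(key)
        keywords.append(term)
    return keywords


print(extract_keywords("greek and september or tribunal"))
```

Each extracted keyword can then be looked up per paragraph, so a paragraph link is produced wherever at least one of the terms occurs.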

21138 I'm not getting any results for "like" so cannot reproduce this.

Jan 23, 2021 at 6:37 PM Notified 11 people

Radomir Mladenovic

I updated the service with a quick fix for bug 21136, let's see if that helps.

Jan 23, 2021 at 6:49 PM Notified 11 people

Harsh Parikh, Tech Lead

OK Thanks

Radomir

. We will check this by Wednesday and update you, as tomorrow is a national holiday here.

Jan 25, 2021 at 6:18 AM Notified 11 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We need to discuss the following bugs with you on a Skype call. Please confirm your availability. We are available between 11:00 AM and 6:00 PM IST.

21138 Search > Keyword not highlighted

Steps to reproduce:

Go to FTS
enter the term 'like' into keyword input
submit search
from search results, find a result with pinpoint references
Select a pinpoint reference to preview excerpt

Result: No keyword is highlighted in the result. The keyword does not seem to appear in the paragraph.

Expected: Only paragraphs with keyword appearances will be available as pinpoint references to preview excerpts. The keyword must be highlighted in the excerpt preview.

image.png 274 KB • Download

21217 : Preview excerpt > must not be displayed if no keyword was included in query

Steps to reproduce:

Go to FTS
Submit a search with no keyword

Result: Search results include pinpoint references to preview excerpts

Expected: No preview excerpts should be available because I have not included a keyword in my search query

image.png 342 KB • Download

21218 : empty results appearing in search

19515 : Any words search > no highlight (In Subject Navigator Index)

Steps to reproduce:

Go to the subject navigator
expand the search options accordion below the search bar
select 'All words' option
in the search bar enter the terms 05 bridgestone
hit enter

Result: The search is performed and the tree is filtered to display only matching branches (and parent branches of). No matching terms are highlighted in the results.

Expected:

terms that match the search will be highlighted within the results

image.png 353 KB • Download

20613 : Search > Fuzzy Typo not working (In Subject Navigator Index)

Steps to reproduce:

Go to FTS
Enter term "blun" into keyword search
Set fuzzy typo to include 1 letter
enter search

Result: No results found

Expected: Should see results that correspond as if I entered "blue"

20867 : SN > Fuzzy typo not working (In Subject Navigator Index)

Steps to reproduce:

Go to SN
Set search setting for fuzzy typo to 2
type in "awurd"

Result: No results found

Expected: Should see results for "award"

Jan 28, 2021 at 7:11 AM Notified 11 people

Radomir Mladenovic

Hi

Harsh

, are these issues reproducible in the database I have access to? Please let me know which database and documents path to use.

We can talk tomorrow at about 13:30 IST, but please provide me with the examples to test today. There's not much sense in having a call if I have no access to test data.

Thanks,
Radomir

Jan 28, 2021 at 9:25 AM Notified 11 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

You can check those bugs on the database: ISLGRebuild (server IP: 10.68.138.11)
The document path: E:\ISLGRebuildDemo\wwwroot\Documents

We will discuss further tomorrow at 1:30 PM IST.

Jan 28, 2021 at 10:06 AM Notified 11 people

Radomir Mladenovic

Hi

Harsh

,

21217 : Preview excerpt > must not be displayed if no keyword was included in query

This was intentional, as you didn't specify the desired behaviour when there is no search query. On OneDrive (https://1drv.ms/u/s!AugzRBG6eTFwg4sJwygyvkWxOQnMWg?e=2MNKXq) under the folder "2021-01-28" you can find the updated web service that fixes this.

Other issues that you mention I could not reproduce as I couldn't find the keywords you mention in the index. I guess we're still indexing different data?

For tomorrow's call, please be prepared to send me the indices for FTS and subjectnav, as well as the indexing logs created while indexing each collection.

Jan 28, 2021 at 9:26 PM Notified 11 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

As discussed in today's call, I have attached 2 zip folders.

1) Full Text Search (includes the Indexes folder and indexlog file)
2) Subject Navigator (includes the Indexes folder and indexlog file)

FullTextSearch.zip 251 MB • Download

SubjectNavigator.zip 1.46 MB • Download

I have also included the JSON payloads that we pass to the web service.

Fuzzy typo bug:

{"searchRequest":"blun","SearchType":"3","Stemming":false,"Synonyms":false,"Fuzzy":true,"Fuzziness":"1","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"DocumentContentTypeId","values":["37","13","12"]}]},"PageNum":0,"PageSize":20}

Highlight issue with the "like" keyword:

{"searchRequest":"like","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"DocumentContentTypeId","values":["37","13","12"]}]},"SearchType":"3","Stemming":false,"Synonyms":false,"Fuzzy":false,"Fuzziness":"1","SortField":"hits","SortOrder":"desc","PageNum":0,"PageSize":"20"}

Jan 29, 2021 at 11:23 AM Notified 11 people

Radomir Mladenovic

Hi

Harsh

,

1) Your FTS index was not created with the latest update that I sent you on Jan 23. I can tell that by "#para_" appearing in the paraId string. I also sent you the source code update for it so you can see for yourself. The update was also trimming extra whitespace that appears in some HTML element IDs. As your index was not created with the latest update, there's not much sense in testing it.

I've noticed in your index that some paragraph IDs even use symbols (I saw some kind of dot). I'm not sure if dtSearch will work properly when finding those. To prevent issues with these, I made a change to the indexer to encode both the file path and the paragraph ID.
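The thread does not say which encoding the indexer uses, so the following is only a sketch of the general idea under an assumed scheme: hex-encoding the UTF-8 bytes so that an ID containing dots or other symbols becomes a purely alphanumeric token that can be decoded again. The function names are hypothetical.

```python
def encode_token(value):
    """Trim surrounding whitespace and hex-encode the UTF-8 bytes (assumed scheme)."""
    return value.strip().encode("utf-8").hex()


def decode_token(token):
    """Reverse encode_token."""
    return bytes.fromhex(token).decode("utf-8")


encoded = encode_token(" pa1.1 ")
print(encoded)                # 7061312e31
print(decode_token(encoded))  # pa1.1
```

Because the encoding is reversible, the web service can return the original paragraph ID to the client while the index only ever stores symbol-free tokens.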

Please, take the indexer update from OneDrive folder 2021-01-30, delete old FTS indexes and re-index!
https://1drv.ms/u/s!AugzRBG6eTFwg4sJwygyvkWxOQnMWg?e=2MNKXq

2) You cannot search for keyword "like" specifically because it's a stop word, listed in the file noise.dat (dtsearch default list). Review that file and remove all keywords you want to use in search - or delete the file.
After that do full re-indexing!
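A quick way to see which query terms a noise list would block, sketched under the assumption that the list is plain text with whitespace-separated words (check the actual noise.dat before relying on this). The sample list below is invented and is not the dtSearch default.

```python
def load_noise_words(text):
    """Parse a whitespace-separated word list into a lowercase set."""
    return {word.lower() for word in text.split()}


def blocked_terms(query, noise_words):
    """Return the query terms that appear in the noise list."""
    return [term for term in query.split() if term.lower() in noise_words]


# Invented excerpt for illustration only.
sample_noise = "a an and like the"
noise_words = load_noise_words(sample_noise)

print(blocked_terms("like tribunal", noise_words))
```

Any term reported here is skipped at indexing time, so removing it from the list only takes effect after a full re-index.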

3) Why do you think that "blun" should find "blue" with Fuzziness=1? Did you find this example in the dtSearch documentation? I see that it finds it when Fuzziness=2 so I'd say that fuzzy search works.

4) Subject Navigation highlighting for more than one term works for me:

image.png 59.4 KB • Download

5) Why do you send "SearchType": "3" in your search request? Are you sure this binds to the desired search type?

Jan 30, 2021 at 10:04 PM Notified 11 people

Harsh Parikh, Tech Lead

Hi

Melissa

and

Savannah

,

Please see the following answer from

Radomir

regarding the fuzzy typo bugs (Bug No: 20867 & 20613):

Why do you think that "blun" should find "blue" with Fuzziness=1? Did you find this example in the dtSearch documentation? I see that it finds it when Fuzziness=2 so I'd say that fuzzy search works.

Also,

Melissa

, please look at

Radomir

's answer for bug no. 21138:

You cannot search for keyword "like" specifically because it's a stop word, listed in the file noise.dat (dtsearch default list). Review that file and remove all keywords you want to use in search - or delete the file.
After that do full re-indexing!

We will re-index this week and let you know once it is completed.

Radomir

, could you please let us know how we can remove noise words from indexing? We have already removed those noise words in the dtSearch Desktop version that we use for the legacy application, but we are not sure how to remove the noise word list for the new rebuild's indexing. Please let us know.

Feb 01, 2021 at 6:23 AM Notified 11 people

Radomir Mladenovic

Harsh

the indexer takes noise.dat from the application folder (the folder containing indexer exe). I have this file in my test folder on the server so I guess you have it as well because the same distribution was sent to you.

Feb 01, 2021 at 8:51 AM Notified 11 people

Harsh Parikh, Tech Lead

Thanks

Radomir

. We have removed all noise words from the indexer.

Melissa

, when we release the next build, please take note of the following point for bug no. 21138:

You cannot search for keyword "like" specifically because it's a stop word, listed in the file noise.dat (dtsearch default list). Review that file and remove all keywords you want to use in search - or delete the file.
After that do full re-indexing!

We will re-index this week and let you know once it is completed.

Feb 01, 2021 at 10:29 AM Notified 11 people

Morgan Maguire, CEO

Hi

Harsh

and

Radomir

,

Re bug no. 21138, please note that in the legacy application, we have removed all keywords from the noise.dat file. Please do the same for the new application to ensure keywords similar to "like" will generate hits.

Re bug no. 20613,

Radomir

and

Rob

could you please explain why a search for "blun" did not produce a result for "blue" when fuzzy typo = 1 was enabled? If you read through the dtSearch support page on fuzzy searching, it seems like this should have produced a hit: https://support.dtsearch.com/webhelp/dtsearch/fuzzy_searching.htm

Thanks,

Morgan

Feb 01, 2021 at 11:09 PM Notified 11 people

Radomir Mladenovic

Hi

Morgan

and

Harsh

,

I understand your question about the fuzzy search, but we don't have insight into dtSearch's internal implementation. The web service passes parameters to the dtSearch API and apparently that works, considering there are results for fuzziness=2.
Unless

Rob

has an answer, you'll have to address this question to dtSearch support.

Thanks,
Radomir

Feb 02, 2021 at 8:10 AM Notified 11 people

Morgan Maguire, CEO

OK. Sounds good,

Radomir

. Unless,

Rob

has further insight, let's move on and consider bug no. 20613 resolved.

At the same time, please note my instructions on bug no. 21138 and removing all keywords from the noise.dat file.

When can we expect the FTS and Subject Navigator searches to be fully implemented within staging.investorstatelawguide.com?

Thanks,

Morgan

Feb 02, 2021 at 3:32 PM Notified 11 people

Rob Wiesenberg

Morgan

, I believe that the fuzzy matching is based on the percentage of the search term that matches the term in the text. This might explain why a four-letter search term with one incorrect letter does not match when the fuzziness is set to 1. Also, there is an API call that uses a % value. I am waiting for confirmation from dtSearch. Will let you know.

Feb 02, 2021 at 3:37 PM Notified 11 people

Morgan Maguire, CEO

Thanks

Rob

. Any clarification we could relay to users would be appreciated.

Morgan

Feb 02, 2021 at 3:42 PM Notified 11 people

Radomir Mladenovic

Morgan

Harsh

I don't think there's any unaddressed bug or unfinished functionality on my end. Please let me know if there's anything else.

Feb 02, 2021 at 4:12 PM Notified 11 people

Morgan Maguire, CEO

Ok. Sounds good,

Radomir

.

Harsh

, let us know when the updated version of the Subject Navigator and FTS searches are deployed to staging.investorstatelawguide.com, and

Melissa

and

Naomi

can complete their UAT.

Thanks,

Morgan

Feb 02, 2021 at 5:06 PM Notified 12 people

Rob Wiesenberg

Morgan

dtSearch has confirmed that the 1-10 fuzzy designation is not based on a letter count.

The correspondence between search fuzziness with the default 1 to 10 setting and character discrepancies is not "1 to 1".

However, you can fine-tune fuzziness "manually" with the % character. Please see pages 42-43 of https://support.dtsearch.com/faq/dtSearch_Desktop.pdf for more on this.
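As we understand the dtSearch documentation, the number of % characters in a term sets how many differences are tolerated, and their position sets how many leading letters must match exactly (e.g. ba%nana). The helper below, a hypothetical `fuzzy_term`, only builds such a term string; it does not call dtSearch, and whether a given term produces a hit should be verified against the real index.

```python
def fuzzy_term(word, exact_prefix, differences):
    """Insert `differences` % characters after the first `exact_prefix` letters."""
    if not 0 <= exact_prefix <= len(word):
        raise ValueError("exact_prefix must lie within the word")
    if differences < 1:
        raise ValueError("differences must be at least 1")
    return word[:exact_prefix] + "%" * differences + word[exact_prefix:]


print(fuzzy_term("banana", 2, 1))  # ba%nana
print(fuzzy_term("banana", 1, 2))  # b%%anana
print(fuzzy_term("blun", 2, 1))    # bl%un
```

Building the term explicitly like this gives users per-word control instead of relying on the global 1 to 10 fuzziness setting.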

Feb 03, 2021 at 12:04 AM Notified 12 people

Morgan Maguire, CEO

OK. Thanks

Rob

.

Melissa

and

Naomi

, we should take note of this and make the appropriate updates to the sections in the Knowledge Centre that address these options in the Full Text Search and Subject Navigator searches.

Thanks,

Morgan

Noted!

Feb 03, 2021 at 12:17 AM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Today, We are going to create Indexing on server. But, Don't know it is not created any module indexing.

Here are the server details:

Server IP : 10.68.138.10
TologixDBIndexer : E:\DevContegraISLGRebuildStagingDBIndexer
Document Path : E:\ISLGRebuildStaging\wwwroot\Documents\
Indexing Path : E:\DevContegraISLGRebuildStagingIndexes\

Database Name : ISLGRebuildStaging (Server 10.68.138.11)

Please note that we are now indexing migrated data, so the amount of data is large.

Please treat this as high priority, as we need to deploy the project to the staging server tomorrow.

I have also attached the indexing log from our attempt to generate the indexes.

indexing.log 93.1 KB • Download

Feb 04, 2021 at 10:31 AM Notified 12 people

Radomir Mladenovic

Hi

Harsh

> Today, We are going to create Indexing on server. But, Don't know it is not created any module indexing.

You can put the indexer and index files wherever you like, the same as you have done so far. I really don't understand what you are asking me to do here.

Feb 04, 2021 at 12:05 PM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Today we are indexing the migrated data. We are using the database ISLGRebuildStaging on server 10.68.138.11.

Please note that we are now using migrated data, hence the amount of data is large.

I have attached the FTS module index and the indexing log file.

Somehow, though, the indexes were not generated properly for all modules.

indexing.log 93.1 KB • Download

fts.zip 76.9 KB • Download

Feb 04, 2021 at 12:26 PM Notified 12 people

Radomir Mladenovic

Harsh

it looks like your query is too slow, so the query timed out. There's an error in the log:

2021-02-04 05:24:12,541 [1] ERROR - Failure in retrieving data from the database
System.Data.OleDb.OleDbException (0x80040E31): Query timeout expired
at System.Data.OleDb.OleDbDataReader.ProcessResults(OleDbHResult hr)
at System.Data.OleDb.OleDbDataReader.NextResult()
at System.Data.OleDb.OleDbCommand.ExecuteReaderInternal(CommandBehavior behavior, String method)
at System.Data.OleDb.OleDbCommand.ExecuteReader(CommandBehavior behavior)
at System.Data.Common.DbDataAdapter.FillInternal(DataSet dataset, DataTable[] datatables, Int32 startRecord, Int32 maxRecords, String srcTable, IDbCommand command, CommandBehavior behavior)
at System.Data.Common.DbDataAdapter.Fill(DataSet dataSet, Int32 startRecord, Int32 maxRecords, String srcTable, IDbCommand command, CommandBehavior behavior)
at System.Data.Common.DbDataAdapter.Fill(DataSet dataSet, String srcTable)
at DBIndexer.SampleDataSource.RetrieveDataFromDB() in p:\contegra\contegra-tologix\DBIndexer\SampleDataSource.cs:line 582

According to the timestamps of the log messages, it looks like the query timeout is 30 seconds.

Feb 04, 2021 at 1:34 PM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

The data size has now increased, so you need to check all the queries and views we provided to you and increase the time allowed for generating indexes, because with this much data the views and queries take a long time to return all the data.

1) Subject Navigator
2) Dispute Document
3) Full Text Search.

Feb 04, 2021 at 1:36 PM Notified 12 people

Radomir Mladenovic

Hi

Harsh

You can get the indexer update from 2021-02-04 folder on OneDrive
https://1drv.ms/u/s!AugzRBG6eTFwg4sJwygyvkWxOQnMWg?e=2MNKXq

I enabled unlimited timeout for the sql command.

BTW I recommend you to edit TologixDBIndexer.exe.conf and set the default logging level to INFO. Currently, it's on DEBUG and your indexing log would be huge.

image.png 4.03 KB • Download

How long does it take for FE_MetafieldwithValueDynamicFTS to complete and start returning data? I started the process almost half an hour ago and I'm still waiting....

Feb 04, 2021 at 9:58 PM Notified 12 people

Radomir Mladenovic

Documents finally started indexing after about half an hour. I've noticed an HTML parsing error related to paragraph extraction and, for now, made a quick fix so that it doesn't break (the update is in the above-mentioned folder), but the extraction of paragraphs needs to be checked.

Feb 04, 2021 at 10:23 PM Notified 12 people

Radomir Mladenovic

I looked into the details of one breaking HTML file. The problem I see is inconsistent use of the "paralvl" classes, because they are nested in an unexpected order - for example, paralvl1 can be found as a child of a paralvl2 level, but in some documents it's the other way around. This causes a big issue in extracting the paragraph number, and not all levels are properly collected (see the 4th column, "paraFiltered", in the attached file).

However, looking at the attached example, I think that the 3rd column ("idFiltered") makes more sense to return to the user as the paragraph number. It better indicates paragraph, footnote, etc.

Morgan

Harsh

Please let me know what you think.

test-new-extracted.html 46.6 KB • Download

image.png 58.4 KB • Download

Feb 04, 2021 at 11:05 PM Notified 12 people

Radomir Mladenovic

Harsh

you can go ahead and create indexes regardless of how we proceed with showing paragraphs. I already index both fields, so we just need a change in the search service to swap them in the results.

Feb 04, 2021 at 11:11 PM Notified 12 people

Morgan Maguire, CEO

Hi

Radomir

,

I think this is really a question for

Harsh

and

Jitesh

to determine, but I do not see any issues with using the idFiltered column for identifying the appropriate paragraphs. This should contain all the unique ID properties that will prevent any duplicate IDs within the same document.

Thanks,

Morgan

Feb 05, 2021 at 12:05 AM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

The Full Text Search module index is created now, after replacing TologixDBIndexer.exe and setting the logging level to INFO in the config file.

But the Subject Navigator module is not generating its index.

The indexing log says that no tables or views were found, but we do have vw_SubjectNavigatorSearch in the ISLGRebuildStaging database.

The indexing log file:

2021-02-05 02:11:45,989 [1] INFO - Execute IndexJob
2021-02-05 02:11:46,320 [1] INFO - Retrieve data from database
2021-02-05 02:12:17,008 [1] WARN - No tables/views found
2021-02-05 02:12:17,141 [1] INFO - Done

Could you please check and confirm ?

Feb 05, 2021 at 7:50 AM Notified 12 people

Radomir Mladenovic

Harsh

I have no access to your environment before the late afternoon. In the meantime, please check your environment and database access rights. Nothing changed in the indexer code that would affect finding the table. Verify the indexer config file as well.
For this case you may want to enable DEBUG log to check any other messages.

Feb 05, 2021 at 9:21 AM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

I have checked, and everything is OK. Can we have a quick call to check?

Feb 05, 2021 at 9:24 AM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

The indexes for the following 2 modules are not created; the indexing message is shown below:

1) Dispute Document
2) Subject Navigator

Can we have a call on Monday (8th February) between 11:00 AM and 6:00 PM IST? Please confirm.

Indexing message :
2021-02-05 02:11:45,989 [1] INFO - Execute IndexJob
2021-02-05 02:11:46,320 [1] INFO - Retrieve data from database
2021-02-05 02:12:17,008 [1] WARN - No tables/views found
2021-02-05 02:12:17,141 [1] INFO - Done

Feb 05, 2021 at 10:24 AM Notified 12 people

Radomir Mladenovic

Yes, we can talk on Monday after 13:00 IST

Feb 05, 2021 at 10:31 AM Notified 12 people

Radomir Mladenovic

Hi

Harsh

,
The Subject Navigator indexing was also suffering from the timeout issue. I set it to unlimited. You can download the indexer update from folder "2021-02-06".

I have no idea why you got the message that no tables/views were found. I'm also sending you my indexer config file in the above-mentioned folder.

Maybe you should look into speeding up the view with indexes or something. I ran the indexer and it took about 40 minutes to start receiving the data, and about 20 to index.
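
As an illustration of how a timeout can be read off the log timestamps, here is a minimal Python sketch. It uses the two log lines quoted earlier in this thread (the "Retrieve data from database" and "No tables/views found" messages); the helper names are made up for this example, and the gap it computes is only consistent with a 30-second query timeout, since the log itself does not state the cause.

```python
from datetime import datetime

# Timestamp layout of the log lines quoted in this thread:
# date, time, comma, milliseconds.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def parse_log_time(line):
    """Parse the leading timestamp of a log line (hypothetical helper)."""
    # The timestamp occupies the first 23 characters: "YYYY-MM-DD HH:MM:SS,mmm".
    return datetime.strptime(line[:23], LOG_TIME_FORMAT)


def seconds_between(first_line, second_line):
    """Return the elapsed seconds between two log lines."""
    return (parse_log_time(second_line) - parse_log_time(first_line)).total_seconds()


start = "2021-02-05 02:11:46,320 [1] INFO - Retrieve data from database"
end = "2021-02-05 02:12:17,008 [1] WARN - No tables/views found"

elapsed = seconds_between(start, end)
print(f"{elapsed:.3f} s between the two messages")
```

A gap this close to 30 seconds between starting the retrieval and the next message is a hint to check the command timeout before looking for a missing table or view.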

Feb 06, 2021 at 8:40 PM Notified 12 people

Radomir Mladenovic

Harsh

I also sent you the SearchController updated to use the paragraph id from the HTML as the paragraph number (instead of the paragraph extracted from the text, as it's not reliable).

Feb 06, 2021 at 9:07 PM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Can we have a call? We are available to discuss the SN and Dispute Document Library indexing.

Feb 08, 2021 at 7:45 AM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

The Subject Navigator index is generated now, but the Dispute index is not generated.

indexer-config-disputes

Following is the indexing log from when we tried to generate the Dispute index:

2021-02-08 03:41:06,132 [1] INFO - Execute IndexJob
2021-02-08 03:41:06,432 [1] INFO - Retrieve data from database
2021-02-08 04:24:44,463 [1] ERROR - Failure in retrieving data from the database
System.Data.OleDb.OleDbException (0x80040E14): Could not allocate space for object 'dbo.WORKFILE GROUP large record overflow storage: 149430086533120' in database 'tempdb' because the 'PRIMARY' filegroup is full. Create disk space by deleting unneeded files, dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup.
at System.Data.OleDb.OleDbDataReader.ProcessResults(OleDbHResult hr)
at System.Data.OleDb.OleDbDataReader.NextResult()
at System.Data.OleDb.OleDbCommand.ExecuteReaderInternal(CommandBehavior behavior, String method)
at System.Data.OleDb.OleDbCommand.ExecuteReader(CommandBehavior behavior)
at System.Data.Common.DbDataAdapter.FillInternal(DataSet dataset, DataTable[] datatables, Int32 startRecord, Int32 maxRecords, String srcTable, IDbCommand command, CommandBehavior behavior)
at System.Data.Common.DbDataAdapter.Fill(DataSet dataSet, Int32 startRecord, Int32 maxRecords, String srcTable, IDbCommand command, CommandBehavior behavior)
at System.Data.Common.DbDataAdapter.Fill(DataSet dataSet, String srcTable)
at DBIndexer.SampleDataSource.RetrieveDataFromDB() in p:\contegra\contegra-tologix\DBIndexer\SampleDataSource.cs:line 585
2021-02-08 04:24:44,582 [1] INFO - Done

Feb 08, 2021 at 10:11 AM Notified 12 people

Radomir Mladenovic

Hi

Harsh

, the indexer doesn't do anything on its own with the DB; it only invokes your view or stored proc. I'm afraid there's nothing I can do to fix the above error; it's on your end (in the database settings or some optimization).

Feb 08, 2021 at 10:45 AM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

The Subject Navigator index is created, but when we search for any word the web service returns the error shown below.

You can find the Subject Navigator index at the following path:

Server : (10.68.138.10)
Indexing Path : E:\DevContegraISLGRebuildStagingIndexes\SubjectNavigatorIndex

Error :

Unable to access index D:\Contegra Indexes\tologix-index-subject\ D:\Contegra Indexes\tologix-index-subject\index_r_1.ix file is truncated. Committed size=79508586 Actual size=58556416 (file: index_r.ix); No files retrieved in search.

Please look into this and provide the solution.

Feb 09, 2021 at 11:52 AM Notified 12 people

Radomir Mladenovic

Harsh

that looks like the dtSearch index got corrupted. Did the indexing process complete normally? Are you testing it on the same system where it was indexed? If it was copied to another system, maybe the copy did not complete?

Feb 09, 2021 at 12:55 PM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We copied the index from the server to a local system. Could you please check whether this index was generated properly?

The index is generated at the following path on the server:

Indexing Path : E:\DevContegraISLGRebuildStagingIndexes\SubjectNavigatorIndex

Feb 09, 2021 at 1:18 PM Notified 12 people

Radomir Mladenovic

Hi

Harsh

, the index on the server looks fine to me. In its log file, you can see that the index file size was 79508586:

image.png 48.4 KB • Download

and the current size is as it says later in the log, I guess after you indexed more data:

image.png 28.5 KB • Download

I guess your copying process failed.
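
One way to confirm whether a copied index folder is complete is to compare it file by file with the source. The following is a minimal Python sketch, not part of the indexer or of dtSearch; the function names are hypothetical and the demonstration uses throwaway directories with a simulated truncated copy of a file named like the one in the error message.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_index_copies(source_dir, copy_dir):
    """List files that are missing, differ in size, or differ in content."""
    problems = []
    for src in sorted(Path(source_dir).iterdir()):
        if not src.is_file():
            continue
        dst = Path(copy_dir) / src.name
        if not dst.is_file():
            problems.append(f"{src.name}: missing in copy")
        elif dst.stat().st_size != src.stat().st_size:
            problems.append(
                f"{src.name}: copy has {dst.stat().st_size} bytes, "
                f"source has {src.stat().st_size}"
            )
        elif file_digest(dst) != file_digest(src):
            problems.append(f"{src.name}: same size but different content")
    return problems


# Self-contained demonstration with temporary directories.
with tempfile.TemporaryDirectory() as source, tempfile.TemporaryDirectory() as copy:
    (Path(source) / "index_r.ix").write_bytes(b"x" * 100)
    (Path(copy) / "index_r.ix").write_bytes(b"x" * 60)  # simulated truncated copy
    demo_problems = compare_index_copies(source, copy)

print(demo_problems)
```

The size check is cheap and would already catch a truncation like the one reported; the hash comparison only runs when the sizes match.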

Feb 09, 2021 at 7:27 PM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

The Dispute Document view takes a very long time to execute and return the data, so we have optimized it using a temp table.

To use a temp table, we need to convert it into a stored procedure.

If we put the stored procedure name in place of the view name in the following file in the TologixDBIndexer project, will it work?

File Name : indexer-config-dispute-docs.json

We would replace the following:

Current : "IndexTablesViews": [ "VW_DisputeContegraSearch" ],

Replace : "IndexTablesViews": [ "SelectDisputeContegraSearch" ],

We will use the stored procedure in place of the view. The columns returned by the stored procedure are the same as those of the view.

You can check this on the ISLGRebuildStaging database.

Please let us know.

Feb 15, 2021 at 10:46 AM Notified 12 people

Radomir Mladenovic

Harsh

instead of the "IndexTablesViews" parameter, you should be using "IndexStoredProcTables" - as we did in the case of the disputes indexing, I believe. With that it should work.

Feb 15, 2021 at 11:55 AM Notified 12 people

Harsh Parikh, Tech Lead

OK Thanks

Radomir

. I hope it will work. We will check and update you by tomorrow in case we find any issue.

Feb 15, 2021 at 11:57 AM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Could you please try this search in the Dispute Document Library?

Database : ISLGRebuildStaging on server (10.68.138.11)

Search Keyword: 9REN
Language : Spanish

We are not finding any results for the above search.

Please provide your feedback

Feb 22, 2021 at 1:31 PM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We are using the following 2 files for Dispute Document indexing. We attached these files from our local system so you can get an idea.

indexer-config-dispute-docs.json 485 Bytes • Download

indexer-config-disputes.json 500 Bytes • Download

Feb 22, 2021 at 1:45 PM Notified 12 people

Radomir Mladenovic

Hi

Harsh

, I indexed the data, reviewed the logs, tested the search and you're right - there are no results for your sample query. However, as far as I can see, the problem is with the data coming from the database and your indexing config files.

Per your config, the document identifier column for the indexer is "ContentTypeDataMasterID". However, the stored procedure "FE_MetafieldwithValueDynamic" provides IDs which are not unique in the results. For example, ContentTypeDataMasterID 12400 appears 25 times.
Whenever a row with the same ID appears, it overwrites the previously indexed row with that ID. In your case, because of all the duplicates, after 89434 rows indexed there are only 9556 in the generated index. Almost 90% of the data was overwritten.

It looks like you have the same problem with the VW_DisputeContegraSearch view.

We had the same discussion about this around April 17 last year. Please refer to the comments above around that date. Back then it was said that you would add "RowId" as an identifier column. I see that FE_MetafieldwithValueDynamic has it, but VW_DisputeContegraSearch does not. After adding the column, update your config files ("DocIdColumnName": "RowId") and re-create the Dispute indexes.
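
The overwrite effect can be shown with a minimal Python sketch. This is not the indexer's code: a plain dictionary stands in for the index, the sample rows and the "Value" column are invented, and only the column names "RowId" and "ContentTypeDataMasterID" and the row counts come from this thread.

```python
def build_index(rows, id_column):
    """Toy stand-in for an index keyed by document id: later rows replace earlier ones."""
    index = {}
    for row in rows:
        index[row[id_column]] = row  # same id -> previous row is overwritten
    return index


# Invented sample rows; two of them share ContentTypeDataMasterID 12400.
rows = [
    {"RowId": 1, "ContentTypeDataMasterID": 12400, "Value": "first"},
    {"RowId": 2, "ContentTypeDataMasterID": 12400, "Value": "second"},
    {"RowId": 3, "ContentTypeDataMasterID": 12401, "Value": "third"},
]

by_master_id = build_index(rows, "ContentTypeDataMasterID")
by_row_id = build_index(rows, "RowId")

print(len(by_master_id))  # non-unique id: fewer documents than rows
print(len(by_row_id))     # unique id: one document per row

# Share of rows lost in the reported run (89434 rows, 9556 documents in the index).
overwritten_share = 1 - 9556 / 89434
print(f"{overwritten_share:.1%} of rows overwritten")
```

With a unique identifier column every row survives as its own document, which is the point of switching "DocIdColumnName" to "RowId".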

Hope this helps.

Feb 22, 2021 at 9:58 PM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Because the previous view had problems creating the index for the large data, we are using the stored procedure (SelectDisputeContegraSearch) in place of VW_DisputeContegraSearch.

Can we have a call today between 1:00 PM and 6:00 PM to resolve this issue?

Feb 23, 2021 at 4:53 AM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We updated the Dispute indexing config file (indexer-config-disputes.json) and set "DocIdColumnName": "RowId", but it still does not work.

We need a call to resolve this issue. We are available between 1:00 PM and 6:00 PM IST.

Feb 23, 2021 at 7:04 AM Notified 12 people

Radomir Mladenovic

Hi

Harsh

, please send me the indexing config files you're using now for the disputes and dispute-docs indexes. I'll generate the indexes and review.
I'll be available for a call within the next two hours but I still need your config files.

Feb 23, 2021 at 8:38 AM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

As per the call, I have attached both indexing config files after updating RowId.

indexer-config-dispute-docs.json 480 Bytes • Download

indexer-config-disputes.json 484 Bytes • Download

Database Name : ISLGRebuildStaging

For the dispute JSON, we are using this SP : FE_MetafieldwithValueDynamic
For the dispute-docs JSON, we are using this SP : SelectDisputeContegraSearch

Both SPs have the unique identifier RowId.

We checked after re-indexing, but it still does not work.

Please let us know your inputs.

Feb 23, 2021 at 8:57 AM Notified 12 people

Radomir Mladenovic

Hi

Harsh

how did you manage to re-index so quickly? For me it took at least an hour yesterday to create both indexes. Anyway, I'll get back to you as soon as I generate the indexes and review.

Feb 23, 2021 at 9:45 AM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We have made a new query, SelectDisputeContegraSearch, in place of the view VW_DisputeContegraSearch.

We are using this query, SelectDisputeContegraSearch, for dispute-doc indexing.

If you index both queries, the indexing will be done within 20-25 minutes.

Feb 23, 2021 at 9:49 AM Notified 12 people

Radomir Mladenovic

Harsh

did you run this on "ISLGRebuildProduction" as in the config you sent me? Because for me the query fails:

ERROR - Failure in retrieving data from the database
System.Data.OleDb.OleDbException (0x80040E14): Could not allocate space for object 'dbo.WORKFILE GROUP large record overflow storage: 141537287733248' in database 'tempdb' because the 'PRIMARY' filegroup is full. Create disk space by deleting unneeded files, dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup.
at System.Data.OleDb.OleDbDataReader.ProcessResults(OleDbHResult hr)
at System.Data.OleDb.OleDbDataReader.NextResult()
at System.Data.OleDb.OleDbCommand.ExecuteReaderInternal(CommandBehavior behavior, String method)
at System.Data.OleDb.OleDbCommand.ExecuteReader(CommandBehavior behavior)
at System.Data.Common.DbDataAdapter.FillInternal(DataSet dataset, DataTable[] datatables, Int32 startRecord, Int32 maxRecords, String srcTable, IDbCommand command, CommandBehavior behavior)
at System.Data.Common.DbDataAdapter.Fill(DataSet dataSet, Int32 startRecord, Int32 maxRecords, String srcTable, IDbCommand command, CommandBehavior behavior)
at System.Data.Common.DbDataAdapter.Fill(DataSet dataSet, String srcTable)
at DBIndexer.SampleDataSource.RetrieveDataFromDB() in p:\contegra\contegra-tologix\DBIndexer\SampleDataSource.cs:line 585
INFO - Done

Feb 23, 2021 at 10:07 AM Notified 12 people

Radomir Mladenovic

and indexer for dispute-docs fails as well:

DEBUG - Getting properties for db row 0 in table SelectDisputeContegraSearch
ERROR - Object reference not set to an instance of an object.
System.NullReferenceException: Object reference not set to an instance of an object.
at DBIndexer.SampleDataSource.GetNextDoc() in p:\contegra\contegra-tologix\DBIndexer\SampleDataSource.cs:line 156
INFO - Done

I'll try indexing staging DB.

Feb 23, 2021 at 10:09 AM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We are able to generate the indexes. What should we do now?

Feb 23, 2021 at 10:13 AM Notified 12 people

Radomir Mladenovic

Harsh

as I asked above, are you indexing staging or production database?

Feb 23, 2021 at 10:22 AM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We created the indexes in our local environment, but we used the same database, ISLGRebuildStaging.

Please try to create the indexes using the ISLGRebuildStaging database.

Feb 23, 2021 at 10:23 AM Notified 12 people

Radomir Mladenovic

Hi

Harsh

, I indexed Disputes using the stage database.
Search for "9REN" returns 11 results. However, when you add a language filter it doesn't find anything. The problem appears to be "SelectDisputeContegraSearch", which doesn't provide the language - it's always empty, and that's why nothing can be found. (The Language values that you see in the dispute results come from the dispute-docs index, but the field has to be in the disputes index as well, as that's where we apply the filter.)

image.png 22.7 KB • Download

Feb 23, 2021 at 12:03 PM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

The above result contains Dispute data only; we also need Dispute Document data.

We get the language from the Dispute Document data. One Dispute entry is associated with multiple Dispute Documents.

We have changed the query logic of FE_MetafieldwithValueDynamic. Now each value has its own row, because using Pivot in the query takes too much time with this amount of data.

So we modified the query so that each value has its own row. If you look up RowId (211742, 211743, 211744), the language data is available. So when a user searches with the search request 9REN and the language Spanish, we need all of this data.

For Dispute indexing, the RowIds for the documents are (51167, 51168, 51169, 51170, 51171, 51172, 51173, 51174, 51175, 51176, 51177, 51178, 51179, 51180, 51181, 51182, 51183, 51184, 51185, 51186, 51187, 51188, 51189)

and for Dispute Document indexing, the RowIds for the documents are (211742, 211743, 211744)

We need all Dispute and Dispute Document data when the user runs the combined search.

Please suggest.

We are available for a call to discuss.

Feb 23, 2021 at 12:47 PM Notified 12 people

Radomir Mladenovic

Hi

Harsh

,

On September 9 you wrote: "As discussed, You will get one column DisputeId from 1st SQL View and you need to pass DisputeId column value to 2nd SQL View and provide us JSON format result to bind data as per our model."

That's how it works:

1) The keyword search and filters are run against the "disputes" index. (For indexing we're using FE_MetafieldwithValueDynamic, correct?)
2) All distinct DisputeIds are collected.
3) In the "dispute docs" index, which was created using "SelectDisputeContegraSearch" (correct?), we find all documents where ContentTypeDataMasterId matches one of the collected DisputeIds. - BTW, as a reminder, your team made this change back in September!
4) Matches from the "dispute docs" index are returned as the result.
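
The flow described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the SearchController code: the two indexes are plain lists of dictionaries, the "Text" and "Title" fields and all sample values are invented, and only the field names "DisputeId", "ContentTypeDataMasterId" and "Language" come from this thread.

```python
def search_disputes(disputes_index, dispute_docs_index, keyword, filters=None):
    """Two-stage lookup: filter the disputes, then fetch their documents."""
    filters = filters or {}

    # Stage 1: keyword search and field filters against the "disputes" index.
    hits = [
        row for row in disputes_index
        if keyword.lower() in row["Text"].lower()
        and all(row.get(field) == value for field, value in filters.items())
    ]

    # Stage 2: collect the distinct DisputeIds of the hits.
    dispute_ids = {row["DisputeId"] for row in hits}

    # Stage 3: return dispute docs whose ContentTypeDataMasterId is among them.
    return [
        doc for doc in dispute_docs_index
        if doc["ContentTypeDataMasterId"] in dispute_ids
    ]


# Invented sample data: the dispute row carries no language value.
disputes = [{"DisputeId": 7, "Text": "9REN Holding v. State", "Language": ""}]
dispute_docs = [
    {"ContentTypeDataMasterId": 7, "Title": "Award", "Language": "Spanish"},
    {"ContentTypeDataMasterId": 7, "Title": "Decision", "Language": "English"},
    {"ContentTypeDataMasterId": 8, "Title": "Unrelated", "Language": "Spanish"},
]

without_filter = search_disputes(disputes, dispute_docs, "9REN")
with_filter = search_disputes(disputes, dispute_docs, "9REN", {"Language": "Spanish"})
print(len(without_filter), len(with_filter))
```

Because the filter is applied in stage 1, a field that is empty in the "disputes" data removes every hit before the dispute docs are even consulted, even if the documents themselves carry the value.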

Row 211743 that you mention above is in the dispute docs index, but I don't see any language fields in the disputes data. I put the indexing logs in the 2021-02-23 folder on OneDrive. Please check indexing-disputes-stage.log and tell me which data row corresponds to row 211743 in the dispute docs.

Feb 23, 2021 at 1:48 PM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Thanks for the update. We have changed the FE_MetafieldwithValueDynamic query back to how it was previously, and we fetch the data per ContenttypedataMasterId. Now there is no need to change the config file.

But now, for the 2nd query, SelectDisputeContegraSearch, we need all results for a ContenttypedataMasterId.

Suppose that in the second query the Claimant column has data in multiple rows; then we need all of those rows in the model.

Currently, your API returns only one row.

Please let us know so we can have a call and discuss.

Feb 24, 2021 at 7:49 AM Notified 12 people

Radomir Mladenovic

Hi

Harsh

, there are two things here:

1) Which field in the 1st index ("disputes") identifies the records you need in the results? (currently it's "DisputeId" - is that the one you need?)
2) Which field in the 2nd index ("dispute docs") should be matched with the values collected above? (currently it's "ContentTypeDataMasterId")

Can you answer these?

If #1 needs to be changed:

The property "FacetedFields" in the indexer config file needs to be updated as it's set to "DisputeId" currently.
Method SearchDisputes in the SearchController needs to be updated to use the proper field.

If #2 needs to be changed:

Method SearchDisputes in the SearchController needs to be updated to use the proper field.

Sorry, I'm not available for a Skype call today. In any case, I hope it's not needed as I believe you have all information you need - you have been modifying the SearchController before.

Feb 24, 2021 at 8:37 AM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We resolved the Dispute Document issue. We added RowId to SelectDisputeContegra (2nd query) and changed the config file for disp-doc to set "DocIdColumnName": "RowId", and it seems to work now.

Feb 24, 2021 at 12:41 PM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

The other issue we found is that if we search for the keyword "Tribunal" in the Full Text Search module, your API takes 8 to 10 seconds just to respond to us. The total number of records found is about 6000.

Could you please improve that and provide an updated SearchController?

You can use the database ISLGRebuildStaging.

Feb 24, 2021 at 12:44 PM Notified 12 people

Radomir Mladenovic

Hi

Harsh

, how many results are you pulling when you search for "Tribunal"? Can you send me your results JSON for this search?
(to test it on my own I'd have to create the full index first)

Feb 24, 2021 at 1:02 PM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We display about 6000 total records on the FTS page after converting the JSON into the model. Following is the JSON for this search.

Tribunal FTS Response.json 584 KB • Download

Feb 24, 2021 at 1:38 PM Notified 12 people

Radomir Mladenovic

Harsh

which database and documents folder should I use to generate the FTS index? If you have a copy of the index on 10.68.138.10 then just let me know where so I could copy it.

Considering the huge results response, the search time is not that surprising, especially because we don't limit results in order to have proper sorting by field. Do you use the option to sort search results? By which field do you sort? If you can, please send me the complete search request json.

Feb 24, 2021 at 2:13 PM Notified 12 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

You can use the following database to generate the index.

Database Name : ISLGRebuildStaging (10.68.138.11)

The document path is as follows, on server IP (10.68.138.10):

Document Path : E:\ISLGRebuildStaging\wwwroot\Documents

By default, we sort by Relevance. The criterion is as follows.

Relevance (default) - relevancy criteria based on current functionality of "hit count".

Following is the JSON file which we passed when we searched for the word "Tribunal".

Tribunal FTS Request.json 1.3 KB • Download

Feb 25, 2021 at 5:07 AM Notified 12 people

Radomir Mladenovic

Hi

Harsh

, thanks, I'll generate index and download for local testing. It may take a while so I will probably not have an answer by the end of your work day today.

Feb 25, 2021 at 8:05 AM Notified 12 people

Harsh Parikh, Tech Lead

OK

Radomir

. Please provide your inputs by tomorrow so we can resolve this.

Feb 25, 2021 at 8:07 AM Notified 12 people

Darsh Shah

Hi

Radomir

,

I am the QC guy from the DEVIT side. I have raised one issue for FTS search:

Steps to Reproduce:

1) Go to the FTS module
2) Search for the text "Tribunal AND Absence" with the "Boolean" search type selected
3) Click on the Search button

Actual Result: The system displays files which contain either the word Tribunal or the word Absence.
Expected Result: The system should display only files which contain both words, Tribunal and Absence.

Feb 25, 2021 at 12:54 PM Notified 13 people

Darsh Shah

Hi

Radomir

,

As per your comment on the "21138 - Search > Keyword not highlighted" issue, we deleted the noise word and re-indexed.
After re-indexing completed, when the user searches for the noise keyword "Like", the search works properly but the word "Like" is not highlighted in the results.

Feb 25, 2021 at 1:06 PM Notified 13 people

Radomir Mladenovic

Harsh

the performance issue comes from the fields highlighting. dtSearch does not support fields-only highlighting, so the way it's implemented now is that we highlight the whole cached document and then extract the fields part. The problem is that the documents are big (I see 1-2MB each) and a lot of time is wasted on something we throw away. I have an idea how to approach this but need several hours, so it will not be ready today. I should have an update by Monday at the latest.

@Darsh
1) Search for "tribunal" gives 6059 results, search for "tribunal AND absence" gives 2137 results. By the numbers, it looks fine. Can you give me more details about the context where you see this issue?
2) That still sounds like the index was generated with a stopwords list. How many results do you get when searching for "like"? If it gives back all documents and all paragraphs, it's still there as a stopword.

Feb 26, 2021 at 8:07 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

OK. Please provide the update on Monday.

For Darsh's issue,

He said that when we search with "tribunal AND absence", 2137 results are found, and when the user then clicks on any paragraph, only the word "tribunal" is highlighted. But the expectation is that only paragraphs containing both words, "tribunal" and "absence", should be found.

Feb 26, 2021 at 8:49 AM Notified 13 people

Radomir Mladenovic

Harsh

, for the Darsh issue, on Jan 22 you said for issue #1136 "A paragraph link must exist for each paragraph in the document where at least one of the keywords from the search was found"
Isn't that in conflict with this issue where you want only those paragraphs where both are found?

Feb 26, 2021 at 9:56 AM Notified 13 people

Harsh Parikh, Tech Lead

Radomir

, but if we apply the dtSearch rules when the user searches with the "And" keyword (e.g. tribunal and Greek) with Boolean, then our expectation is that only documents where both words match in the HTML or PDF are displayed.

I am not 100% sure about the dtSearch rules.

Morgan

, Please suggest.

Feb 26, 2021 at 10:12 AM Notified 13 people

Radomir Mladenovic

Harsh

two separate (and a bit different) searches happen for FTS and paragraph highlighting.

Using AND implies you expect documents containing both (or all) keywords. That's fine, that's how FTS currently works - you get documents containing both keywords in the complete document.
When showing paragraphs within the document, we show all paragraphs containing any keyword. Initially this was showing only paragraphs containing both keywords but after your feedback for issue #1136 this was modified to show "any".

Please let us know what the desired behavior is on your end, because the expected result for #1136 and the issue reported by Darsh are in conflict.

If you prefer only paragraphs containing all keywords, then we have to modify how the search works - the main search should then be in the paragraphs index, followed by getting the documents where matching paragraphs are found, not vice versa.
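
The difference between the two behaviors can be sketched in Python. This is a simplified illustration, not how dtSearch or the search service evaluates queries: it uses plain case-insensitive substring matching on an invented three-paragraph document, and the function names are hypothetical.

```python
def document_matches_all(paragraphs, keywords):
    """Document-level AND: every keyword occurs somewhere in the document."""
    text = " ".join(paragraphs).lower()
    return all(keyword.lower() in text for keyword in keywords)


def paragraphs_with_any(paragraphs, keywords):
    """Current paragraph behavior: indexes of paragraphs containing any keyword."""
    return [
        i for i, paragraph in enumerate(paragraphs)
        if any(keyword.lower() in paragraph.lower() for keyword in keywords)
    ]


def paragraphs_with_all(paragraphs, keywords):
    """Alternative behavior: indexes of paragraphs containing every keyword."""
    return [
        i for i, paragraph in enumerate(paragraphs)
        if all(keyword.lower() in paragraph.lower() for keyword in keywords)
    ]


# Invented sample document.
paragraphs = [
    "The Tribunal convened in March.",
    "In the absence of the respondent, the hearing was postponed.",
    "Costs were reserved.",
]
keywords = ["tribunal", "absence"]

print(document_matches_all(paragraphs, keywords))
print(paragraphs_with_any(paragraphs, keywords))
print(paragraphs_with_all(paragraphs, keywords))
```

In this sample the document satisfies the AND query, two paragraphs are linked under the "any" rule, and none would be linked under the "all" rule, which is why the two requirements cannot both hold for the same paragraph list.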

Feb 26, 2021 at 10:35 AM Notified 13 people

Morgan Maguire, CEO

Hello

Radomir

,

Harsh

and

Darsh

,

Here is a video explaining how the highlights should work depending on whether the user is performing a search with "All Words" or "Boolean" when "and" is included as a search term. If there is any further confusion on these requirements, please refer to how the search results are produced in the legacy application:

Clarification on FTS requirements-2 (26-Feb-21).mp4 19.4 MB • Download

Thanks,

Morgan

Feb 26, 2021 at 5:37 PM Notified 13 people

Radomir Mladenovic

Harsh

@Darsh, I think the highlight of paragraphs currently behaves as in Morgan's video - any keyword of an AND boolean expression is highlighted in a paragraph (in the video that's a page), not only in paragraphs that have both keywords. If I'm missing something, please let me know.

Feb 26, 2021 at 9:16 PM Notified 13 people

Radomir Mladenovic

As for highlighting of "like", I tested it and it works. (I generated the index for stage using an empty stopwords list.)

image.png 60.6 KB • Download

Feb 27, 2021 at 7:59 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We removed the noise word from noise.dat in the TologixDBIndexer project and then re-indexed.

Is there any other noise list from which we need to remove it?

Feb 27, 2021 at 8:40 AM Notified 13 people

Radomir Mladenovic

Hi

Harsh

, the indexer is using noise.dat file from the folder where the indexer exe is. That's all.
After changing the stopwords list you have to delete the old index and re-index, because the noise words are copied to the index folder when the index is created. In an existing index you can see the stopwords in use in the index_n.ix file.

Feb 27, 2021 at 9:45 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We did the same thing. The latest indexes for all modules are at the following path on the server. Could you please check whether they are OK?

Server : 10.68.138.10

Indexing Path : E:\DevContegraISLGRebuildStagingIndexes

Feb 27, 2021 at 9:52 AM Notified 13 people

Radomir Mladenovic

Hi

Harsh

, FTS index looks OK.

Feb 27, 2021 at 10:17 AM Notified 13 people

Radomir Mladenovic

Hi

Harsh

, you can find updated indexer and search in folder 2021-02-27 on OneDrive - https://1drv.ms/u/s!AugzRBG6eTFwg4sJwygyvkWxOQnMWg?e=gfh2JQ

In order to work around the performance issue I had to create an additional index as a part of FTS indexing. On my system, search for "tribunal" is about 10x faster now.

To specify the path of this new index, use the property "FieldsIndexDir" in the indexer config file for FTS. I sent you a sample config file as well.

The SearchController has been updated as well. You need to add "FullTextFieldsIndex", with the path to the new index, to the application.conf

Let me know how this worked for you.

Feb 27, 2021 at 8:14 PM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We are making the changes as per your instructions above. But in the Web Search project, after copying the SearchController file, the following error is shown.

image.png 245 KB • Download

Please suggest.

Mar 01, 2021 at 5:52 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

The indexer is also not creating any indexes and gives the following error after copying the TologixDBIndexer.exe file.

System.InvalidOperationException: Nullable object must have a value.
at System.ThrowHelper.ThrowInvalidOperationException(ExceptionResource resource)
at System.Nullable`1.get_Value()
at DBIndexer.CmdLineIndexer.runMainIndexer(DocFieldsDataSource fds, ParagraphDataSource pds) in p:\contegra\contegra-tologix\DBIndexer\CmdLineIndexer.cs:line 199
at DBIndexer.CmdLineIndexer.run() in p:\contegra\contegra-tologix\DBIndexer\CmdLineIndexer.cs:line 123
at DBIndexer.MainForm.Main(String[] args) in p:\contegra\contegra-tologix\DBIndexer\MainForm.cs:line 181

Mar 01, 2021 at 5:59 AM Notified 13 people

Radomir Mladenovic

Hi

Harsh

, sorry, I thought it was obvious from the context that that field is just one more string. I'm attaching the full file.

AppSettings.cs 3.3 KB • Download

I'm checking indexer and will be back to you in a minute.

Mar 01, 2021 at 11:33 AM Notified 13 people

Radomir Mladenovic

Harsh

you can get updated indexer from the 2021-03-01 folder on OneDrive.

Mar 01, 2021 at 11:41 AM Notified 13 people

Harsh Parikh, Tech Lead

OK Thanks

Radomir

. Will check and let you know if we face any issue.

Mar 01, 2021 at 11:42 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We are getting an error when we try to generate the FTS index.

The Subject Navigator and Dispute Document indexes were created successfully.

Below are our local indexer config file for FTS and the FTS indexing log file.

indexer-config-fts.json 907 Bytes • Download

indexing.log 1.79 KB • Download

Mar 01, 2021 at 12:34 PM Notified 13 people

Radomir Mladenovic

Harsh

, the path for "FieldsIndexDir" is not properly encoded in your config file. Each backslash must be written as two: "\\". See how it's done for other properties:

image.png 29.4 KB • Download
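
The effect of a single backslash can be checked with any JSON parser. Below is a minimal sketch in Python (not part of the indexer). The key name comes from the message above, the paths are made-up examples, and how the indexer's own parser reacts is not shown in this thread:

```python
import json

# Hypothetical example paths; only the "FieldsIndexDir" key is from the thread.
GOOD = r'{"FieldsIndexDir": "C:\\temp\\fields"}'        # doubled backslashes
BAD_SILENT = r'{"FieldsIndexDir": "C:\temp\fields"}'    # \t and \f are valid JSON escapes
BAD_ERROR = r'{"FieldsIndexDir": "C:\indexes\fields"}'  # \i is not a JSON escape


def load_fields_dir(text):
    """Return the parsed FieldsIndexDir value, or None if the JSON is rejected."""
    try:
        return json.loads(text)["FieldsIndexDir"]
    except json.JSONDecodeError:
        return None


for sample in (GOOD, BAD_SILENT, BAD_ERROR):
    print(repr(load_fields_dir(sample)))
```

The second case is the risky one: the file parses, but `\t` and `\f` become a tab and a form feed, so the value no longer names the intended folder.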

Mar 01, 2021 at 12:40 PM Notified 13 people

Harsh Parikh, Tech Lead

Thanks

Radomir

. I have missed that

Mar 01, 2021 at 12:41 PM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

The FTS indexing gets stuck after RowId 2. It has been stuck for the last 15 minutes. I have attached the log file.

indexing.log 538 Bytes • Download

Mar 01, 2021 at 1:11 PM Notified 13 people

Radomir Mladenovic

Hi

Harsh

, that's strange. I generated index for stage on Saturday without any issues. (You can even find my indexes somewhere under C:\temp I believe, I cannot login to the VPN at the moment.)
To troubleshoot, try enabling DEBUG logging level, re-start indexing and see what's logged.

Mar 01, 2021 at 1:52 PM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

I have started the indexing again, but it still gets stuck after RowId 2. How do I enable debug mode?

We are currently trying to index in our local environment, but the database is the same as ISLGRebuildStaging.

Mar 01, 2021 at 2:01 PM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

After updating the PDFHighlighter URL, the FTS indexing still gets stuck after RowId 2. Please suggest how we can resolve this issue.

Mar 02, 2021 at 6:40 AM Notified 13 people

Radomir Mladenovic

Hi

Harsh

, did you try enabling the DEBUG logging level? In the TologixDBIndexer.exereplace INFO with DEBUG:

<root>
<level value="DEBUG" />
<appender-ref ref="ConsoleAppender" />
<appender-ref ref="FileAppender" />
</root>

BTW, is there a problem with the VPN currently? I wanted to try re-indexing on the 10.68.138.10 server but I cannot connect although I'm on the VPN. Any idea?

Mar 02, 2021 at 7:33 AM Notified 13 people

Harsh Parikh, Tech Lead

No

Radomir

. I am able to connect VPN on 10.68.138.10.

Also, I am not able to find TologixDBIndexer.exereplace in TologixDBIndexer Project. Could you please guide us how to enable debug mode ?

Mar 02, 2021 at 7:38 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Can we connect quickly on Skype to resolve this FTS indexing issue in our local environment ?

Mar 02, 2021 at 7:42 AM Notified 13 people

Radomir Mladenovic

The VPN issue was something on my end. It works after restarting the system.

I'm trying to run indexing again - currently waiting for your proc to give results and start indexing. I'll get back to you soon.

Mar 02, 2021 at 7:55 AM Notified 13 people

Radomir Mladenovic

Harsh

for me indexing works on the server:

2021-03-02 03:02:54,095 [1] DEBUG - Indexing file E:\ISLGRebuildStaging\wwwroot\Documents\HTMLFiles\ARB-0011 - ICSID Institution Rules (1984).html
2021-03-02 03:02:54,235 [1] DEBUG - Complete DocFields: rowid	3	contenttypedatamasterid	10697	documentcontenttypeid	13	documentid	3	documentcontenttypename	Arbitration Rules	pdffilename	ARB-0011 - ICSID Institution Rules (1984).pdf	htmlfilename	ARB-0011 - ICSID Institution Rules (1984).html	fullcitation	ICSID Institution Rules (1984)	fullcitationtext	ICSID Institution Rules (1984)	language	English	validfrom	26 September 1984	validto	31 December 2002	validtopresent	false	disputeid	0	proceedingstageid	0	proceedingstageorder	0	issubsequentdevelopments	false	refjuriscount	0	issuingorganization	International Centre for Settlement of Investment Disputes	ispdfonly	false	isuploadskipped	false	treatytypeid	0	sortingdate	19840926	field_22	ICSID Institution Rules (1984)	field_23	ARB/0011	field_24	19840926	field_25	20021231	field_26	false	field_27	368	field_32	29	internal_file_id	0278A68311C95612114AB0EC4E1CA84D	internal_rec_id	BE326C0A9485CE198199B791AC196736
2021-03-02 03:02:54,235 [1] DEBUG - Getting properties for db row 3 in table FE_MetafieldwithValueDynamicFTS
2021-03-02 03:02:54,235 [1] INFO  - Index doc db://FE_MetafieldwithValueDynamicFTS#RowId=4
2021-03-02 03:02:54,235 [1] DEBUG - Handling column: RowId = 4
2021-03-02 03:02:54,235 [1] DEBUG - Handling column: ContentTypeDataMasterId = 10698

Please send me your config file. I'd like to check it and try it.

I don't think that having a Skype call now is productive, considering we have to wait almost 30 minutes for the stored procedure to give results. Can you reproduce the issue when indexing the dev database? Or maybe you can isolate the issue by checking what row #2 is and making a proc that returns a data set with only that row?

Mar 02, 2021 at 8:10 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Following, I have attached my config file of FTS.

indexer-config-fts.json 927 Bytes • Download

I am going to pull the ISLGRebuildStaging database from the server (10.68.138.11) to my local system and try re-indexing again. If we still get the same issue, I will let you know.

Mar 02, 2021 at 8:43 AM Notified 13 people

Radomir Mladenovic

I cannot connect to your database, so I'm not able to reproduce this.
Let me know how indexing stage works for you.

Mar 02, 2021 at 9:17 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

The indexing issue is still there. We downloaded the whole ISLGRebuildStaging database from the server and hosted it in our local environment, but the indexing still gets stuck after row 2.

Have we missed anything in the TologixDBIndexer project?

Can we have a call and sort out this issue? It is critical for us to check the FTS module with indexing.

Mar 02, 2021 at 9:59 AM Notified 13 people

Radomir Mladenovic

Hi

Harsh

, I have another call at 12:00 my time. I'll ping you now on skype and let's see what we can do by then...

Mar 02, 2021 at 10:30 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

I have copied your DBIndexer3 Project from Server and replaced my indexer config but now we are getting following error.

image.png 258 KB • Download

Mar 02, 2021 at 11:09 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

The old indexer is working fine for FTS. There might be something wrong with the new indexer exe which you provided on 1st March.

Please look into this and let us know how we can resolve this issue so that we can generate indexes for the FTS module.

Mar 02, 2021 at 2:22 PM Notified 13 people

Radomir Mladenovic

Hi

Harsh

, there's nothing special in the new indexer except that it starts one more processing thread for the third index. Could it be some limitation of the system you're using? Can you try indexing on the server, as it worked fine for me there?
I'll add more logging to the indexer so that we can figure out where exactly it stops for you.

Mar 02, 2021 at 2:54 PM Notified 13 people

Radomir Mladenovic

Harsh

I prepared an indexer update with additional logging. Please take the update from the 2021-03-02 folder on OneDrive, run it and send me the log. (You can stop it as soon as you notice that it has hung.)

Mar 02, 2021 at 3:42 PM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We have updated the TologixDBIndexer.exe file and tried to generate the FTS indexes, and the following log was produced.

Please confirm whether this is OK or not.

image.png 242 KB • Download

Mar 03, 2021 at 6:22 AM Notified 13 people

Radomir Mladenovic

Harsh

you see this error in the log?

image.png 31.4 KB • Download

I believe this is why it doesn't work for you. Check the path for the fields index in your config file - I guess you didn't put double back-slash in the folder path.

Mar 03, 2021 at 6:40 AM Notified 13 people

Harsh Parikh, Tech Lead

Yes true

Radomir

. We have updated the path to use double backslashes and are re-indexing now.

We will let you know if we still face an issue.

Mar 03, 2021 at 6:42 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

The FTS index is created now, but we have not seen any performance improvement. When we search for the word "Tribunal", it still takes a long time to get the response from the API.

Mar 03, 2021 at 9:19 AM Notified 13 people

Radomir Mladenovic

Harsh

did you copy all 3 indexes? did you set up the new fields index in the web app config?

Mar 03, 2021 at 9:40 AM Notified 13 people

Harsh Parikh, Tech Lead

Yes

Radomir

. The new indexes were created in all 3 FTS folders. We also set up the new fields index in the web app config. I have attached my appsettings.json file.

appsettings.json 1.42 KB • Download

Mar 03, 2021 at 9:43 AM Notified 13 people

Radomir Mladenovic

That doesn't make sense. Did you copy the SearchController changes?

Mar 03, 2021 at 9:56 AM Notified 13 people

Harsh Parikh, Tech Lead

Yes

Radomir

. I have copied the SearchController as well.

Mar 03, 2021 at 9:57 AM Notified 13 people

Radomir Mladenovic

Harsh

run the server in debug mode, put a breakpoint on line 151 (where the return from the method is), run search, when it stops at the breakpoint send me values of variables time1, time2, ... time5.

Mar 03, 2021 at 10:02 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Following are values.

time 1 : 5308
time 2 : 1895
time 3 : 6157
time 4 : 10232
time 5 : 3566

Mar 03, 2021 at 10:22 AM Notified 13 people

Radomir Mladenovic

Harsh

that looks very slow. What are your system specs?
I'm getting way lower values, all done in 2s on the first run, the next one 1.5s.

image.png 6.36 KB • Download

image.png 32.7 KB • Download

Mar 03, 2021 at 10:44 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

I hosted the published code in a different IIS and it seems fine. Let us know once you have completed the FTS changes as per our discussion on Monday (1st March).

Mar 03, 2021 at 10:51 AM Notified 13 people

Morgan Maguire, CEO

Hi everyone,

Further to our discussions on the searches in Full Text Search and Subject Navigator during my meeting with DevIT and Industrial earlier today and to give

Rob

and

Radomir

further context to the problems we're experiencing, I created the video below outlining the issues I've discovered within these searches:

1. Fuzzy typo is working as expected (not an issue) in both the Subject Navigator and Full Text Search.
   - Naomi, note that the fuzzy typo function is proportionate to the number of characters searched. Therefore, a term like "abose" may have too few characters for a fuzzy typo setting of "1" to compensate and produce results for "abuse".
2. The speed of both searches is slower than expected (and there should be a loading indicator while the searches are loaded).
3. The Subject Navigator search results are incomplete.
4. Boolean operators are not working within the Full Text Search.
5. There is no way to access PDF results within the Full Text Search.
6. Harsh has indicated that the indexes are extremely large (30GB+).

I'm going to have a call with

Rob

this afternoon to discuss these issues, and if required we'll set up a call tomorrow at 8:15am Vancouver time to discuss them as a group.

Thanks,

Morgan

Problems with dtSearch in new application (3-Mar-21).mp4 21 MB • Download

Mar 04, 2021 at 8:20 PM Notified 13 people

Radomir Mladenovic

2. What are the specs of the system running the staging application?

3. I can investigate this.

Harsh

please send me configuration file (or parameters needed) to index Subject Navigator data in staging.

4. I know why - after it was said to get paragraphs containing any keyword in the query, I changed the search type of the paragraph query to "any keyword". (The main search for the documents still runs as a boolean search.) This can be fixed but, if we need to change the search to match all keywords in the paragraph, then the complete FTS search needs to be reworked, so there is no point in fixing it.

5. I will investigate this.

6. It's a tricky one. Indexes are huge because in the created index we cache the complete document text and the original file. This is needed by dtSearch to do highlighting faster, although I think it's really needed only for PDF highlighting.
It's possible to make this PDF highlighting work without storing the original file content in index, but at the cost of the highlighting performance.
Another issue with removing original files from the index is that we'd have to create a separate index only for PDFs, using the full document path. This is because the current organization of data allows the same file to appear in multiple indexed documents - I hit this issue at the beginning of the FTS implementation and had to make changes to accommodate the data in the database.
In short, we can bring the index size down by creating one more index.

Mar 04, 2021 at 9:45 PM Notified 13 people

Morgan Maguire, CEO

Hi everyone,

I just had a call with

Rob

. He is going to connect with

Radomir

tomorrow to work through solutions to the issues above, come back with a plan to tackle each issue and connect with

Harsh

and the team as required.

In the interim,

Harsh

could you please provide

Radomir

with the configuration file for the Subject Navigator index so that

Radomir

can examine the issues with missing results in the search.

Thanks,

Morgan

Mar 04, 2021 at 10:49 PM Notified 13 people

Radomir Mladenovic

Harsh

I don't need the config for the Subject Navigator any more. I generated the index using the same stage db and getting the same number of total results for the "good faith" search.
Now, to troubleshoot this issue, I need from you:

The payload of the search request you're sending.
At least one row number of the stored procedure data in which requested data appears but you're not getting it in the search results. For example, a row for one of those items that should appear in the A or B section mentioned by Morgan.

Thanks,
Radomir

Mar 04, 2021 at 11:11 PM Notified 13 people

Radomir Mladenovic

Harsh

please also provide a sample of a search that returns a PDF without pages. Using the FTS index I generated for stage, I ran the following search for "tribunal", limited to ispdfonly=true:

{
    "searchRequest": "tribunal",
    "FilterStatement": {
        "type": "boolean",
        "Operator": "and",
        "clauses": [
            {
                "type": "match",
                "field": "ispdfonly",
                "values": [
                    "true"
                ]
            }
        ]
    },
    "SearchType": "3",
    "Stemming": false,
    "Synonyms": false,
    "Fuzzy": false,
    "Fuzziness": "1",
    "SortField": "hits",
    "SortOrder": "desc",
    "PageNum": 0,
    "PageSize": "20"
}

and all results have paragraphs (pages in the case of PDFs) returned - you can see this from the line numbers:

image.png 26.3 KB • Download

Did you have PDF documents on your system when you generated the index? Any errors in your FTS indexing log?

Mar 05, 2021 at 8:09 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Have you updated Indexer for FTS ?

Mar 05, 2021 at 8:19 AM Notified 13 people

Radomir Mladenovic

Harsh

no, I used the same indexer that I sent to you.

Mar 05, 2021 at 8:21 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We checked with tribunal and nationality words and it works fine. We get only those documents which have pinpoint references.

Mar 05, 2021 at 9:07 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We are looking in to Subject Navigator SQL View and will check again (2nd Point).

Let us know once you completed rest of points.

Mar 05, 2021 at 9:17 AM Notified 13 people

Radomir Mladenovic

Hi

Harsh

, I'm not quite sure from your message what you have checked so far. From my point of view, there's only #4 that needs to be addressed, and I'll look into #6.
If you see other issues, please provide details as I asked you above so that I can cross-reference data and index created.

Mar 05, 2021 at 12:11 PM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Can we take call for Subject Navigator result ?

Mar 05, 2021 at 12:12 PM Notified 13 people

Radomir Mladenovic

Harsh

I'm not available for a call today. Please provide me with the data I asked for and I'll take a look at it.

Mar 05, 2021 at 12:13 PM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

What do you need from my side? If I search for "good faith", the result is not generated properly.

Please let us know what I need to provide.

Mar 05, 2021 at 12:17 PM Notified 13 people

Radomir Mladenovic

Harsh

I'm confident that the indexer is working correctly for data it got. If you don't see the results expected, please show me that the missing data is returned by your stored procedure in the first place. I cannot troubleshoot this if I don't know what we're looking for.

So, as asked above, I need from you:

The payload of the search request you're sending.
At least one row number of the stored procedure data in which requested data appears but you're not getting it in the search results. For example, a row for one of those items that should appear in the A or B section mentioned by Morgan.

Mar 05, 2021 at 12:26 PM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Please see the following screenshot. If we search for "good faith", we need the following row.

You can find this data in the ISLGRebuildStaging database. We have just updated the SQL view.

Id : 24338
BranchID : 17559
ParentId : 5579
Branch Name : See "Good faith"
HierarchicleParentIds : 1,5579,17559,4202,4661,5207,5579,5580,5583,5585,5597,5606,5657,5661,5686,5688,5693,5695,5708,5724,7523,7826,8002,8142,8258,8594,9588,9684,10431,11241,11258,11685,12649,12959,13084,13181,13666,13693,14749,15532,15562,16154,17381,17559,19044,19092,20332

SelectedNodes : 1,5579,17559

image.png 352 KB • Download

Mar 05, 2021 at 12:32 PM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

I also noticed that there is no "good faith" branch under B, so why does your JSON response include the B branch?

There is no detail associated with the words "good faith" under the B branch.

Mar 05, 2021 at 1:12 PM Notified 13 people

Radomir Mladenovic

Harsh

I don't know which info points to "B" section, it's your data and I don't know how to interpret. I really need more info in tracking this down. Make sure the data is in the data set indexed (e.g. row number), send me your request payload and what you expected to get back but not received. Which field should I be looking at?
I just indexed SubjectNav after you updated the proc and will check the previous thing you sent me.

Mar 05, 2021 at 1:31 PM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

The Subject Navigator has a parent-child branch structure. The words "good faith" appear under the branch A > Abuse of process > "see good faith". But your response does not include the A branch nodes.

It is very complicated to provide you with the results for all branches, but as a first step we need these nodes in the response.

Can we have a short call so I can give you the details?

image.png 236 KB • Download

Mar 05, 2021 at 1:36 PM Notified 13 people

Radomir Mladenovic

Harsh

I have some time atm, I'll give you a call shortly.

Mar 05, 2021 at 1:43 PM Notified 13 people

Radomir Mladenovic

Hi

Harsh

, are you using pagination with the Subject Navigator search? If you don't pull all results then it's normal that you don't get all nodes you're expecting. When I get all results, I see node "1" in the SelectedNodes.
However, I see an issue with this search because nodes in the SelectedNodes are not unique and values repeat a lot. I'll change that.
Another issue I see with this search is that, when you pull all results, the search is slow because all results are being highlighted. I think we should also use progressive highlighting for this. I'll add some options for this.
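
For reference, repeated IDs in a comma-separated node list can be collapsed while keeping the first-seen order. This is a minimal Python sketch, not the search service code; `unique_nodes` is a hypothetical helper name, and the sample string is the start of the parent ID list posted earlier in this thread:

```python
def unique_nodes(csv_ids):
    """Split a comma-separated ID string and drop repeats, keeping first-seen order."""
    ids = [part.strip() for part in csv_ids.split(",") if part.strip()]
    # dict keys are unique and preserve insertion order
    return list(dict.fromkeys(ids))


print(unique_nodes("1,5579,17559,4202,4661,5207,5579"))
```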

Mar 05, 2021 at 3:09 PM Notified 13 people

Harsh Parikh, Tech Lead

Ok

Radomir

. Please provide the updated solution. We will be available tomorrow between 11:00 AM and 3:00 PM IST, so we will check and let you know.

Mar 05, 2021 at 3:49 PM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We are not using Paging in SN module. we need to get all data.

Mar 06, 2021 at 12:28 AM Notified 13 people

Radomir Mladenovic

Hi

Harsh

, I'm sending you the search service updates for the Subjects Navigator search. The files are in the 2021-03-06 folder on OneDrive.

1) My understanding is that for the Subjects Navigator you need all results immediately. I believe you were not getting some nodes because you were taking the first page only and the nodes are not in order.
The "subject-nav" search method has been modified to return all results, but without highlighted fields. That should allow you to build the complete page with all the content.

One more new thing is that in the response you can find "timeLog" array with a log of time spent in different stages. For example, executing Subjects Navigator search for "good faith" on my system takes 620ms:

image.png 4.31 KB • Download

I would suggest logging "timeLog" to the browser console. Then you can easily review this when testing.

2) Next, to get highlighted fields for the Subjects Navigator search, you need to use "highlight-subject-nav" service method. The request payload is the same as for the "subject-nav", with addition of "FieldFilterName" and "FieldFilterValues" fields that limit results.
For example, after adding the results (from "subject-nav") to the page, you could collect the "id" field values of all results visible in the viewport, and possibly those for one more page, and send a request to "highlight-subject-nav":

{
    "searchRequest": "good faith",
    "SearchType": "AllWords",
    "Stemming": false,
    "Synonyms": false,
    "Fuzzy": false,
    "Fuzziness": "2",
    "FieldFilterName": "id",
    "FieldFilterValues": [18379, 5602, 3082]
}

That would return results only for those 3 nodes (where id is 18379, 5602 or 3082), with "highlightedFields" included. Take the highlighted fields and update the page.
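
Building that second request from the first one could look roughly like the Python sketch below. The field names are taken from the payload above; `build_highlight_request` is a hypothetical helper, and sending the request (URL, HTTP client) is left out because those details are not given in this thread:

```python
import json


def build_highlight_request(search_request, visible_ids):
    """Copy the "subject-nav" payload and add the two field-filter keys."""
    payload = dict(search_request)  # leave the original payload untouched
    payload["FieldFilterName"] = "id"
    payload["FieldFilterValues"] = list(visible_ids)
    return payload


# Same options as the "subject-nav" request, as in the example payload above.
base = {
    "searchRequest": "good faith",
    "SearchType": "AllWords",
    "Stemming": False,
    "Synonyms": False,
    "Fuzzy": False,
    "Fuzziness": "2",
}

# IDs of the results currently visible on the page (values from the example).
body = json.dumps(build_highlight_request(base, [18379, 5602, 3082]), indent=4)
print(body)
```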

Mar 06, 2021 at 12:36 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Following bugs are found by Industrial team in Subject navigator module. Could you please check and provide update :

Bug no : 22141 [staging] any words > search doesn't disregard words such as "and" or "or"

Steps to reproduce:

Set search type to "any words"
type a search like "abuse and ego"

Result: The results highlight the word "and" but not abuse or ego

Expected: Words such as "and" or "or" will be ignored in the any words search type, and abuse or ego would be found as matches

Bug no : 22160 [Staging] Stemming > Not working

Steps to reproduce:

Search options > check the box for stemming
Type in the word "like"
view results
Reset search
Type in the word "likelihood"
View results

Result: The results for "likelihood" don't appear when searching for "like"

Expected: If stemming is selected, I should see results for "likelihood" when searching for like

bug No. 22162 : [Staging] Synonyms > Not working

Steps to reproduce:

Type in "bias" in search field
Check the box for "synonym"
Enter search
View results
Search "discrimination"
View results

Result: No synonyms appear for that word, only bias is shown

Expected: Will see highlighted results for synonyms

Mar 06, 2021 at 7:23 AM Notified 13 people

Radomir Mladenovic

Harsh

regarding the missing "A" node, it happens because we're hitting dtSearch query size limit (70k characters) pulling hierarchy IDs. I'll refactor this and provide you with an update later today.
However, related to the change we made in the code during our Skype call, I'd suggest you review on your side whether you need nodes from both "SelectedNodes" and "hierarchicalparentids". It looks like the latter pulls in many more nodes.
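
One general way to stay under a query length limit is to split a long ID list into batches and run one query per batch. The Python sketch below only illustrates that idea; it is not necessarily how the refactoring is done, `batch_ids` is a hypothetical helper, and the " or " separator is a placeholder rather than confirmed dtSearch query syntax:

```python
def batch_ids(ids, max_chars, sep=" or "):
    """Group IDs so that each batch, joined with sep, is at most max_chars long.

    A single ID longer than max_chars still gets a batch of its own.
    """
    batches, current, length = [], [], 0
    for node_id in map(str, ids):
        extra = len(node_id) + (len(sep) if current else 0)
        if current and length + extra > max_chars:
            batches.append(current)      # close the full batch
            current, length = [], 0
            extra = len(node_id)         # first item in a batch needs no separator
        current.append(node_id)
        length += extra
    if current:
        batches.append(current)
    return batches


# Tiny limit for illustration; the limit mentioned above is 70k characters.
for batch in batch_ids([1, 5579, 17559, 4202], max_chars=12):
    print(" or ".join(batch))
```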

Mar 06, 2021 at 9:42 AM Notified 13 people

Harsh Parikh, Tech Lead

Yes

Radomir

. We checked, and we need all hierarchicalparentids, because within the search we also render the other branches so that the user can navigate them. So we need the results for all hierarchicalparentids branches.

Mar 06, 2021 at 9:44 AM Notified 13 people

Radomir Mladenovic

Hi

Harsh

Bug no : 22141 - That's how dtSearch works. You put a keyword into the query and asked for any/all words. If you don't want it, don't put it into the query box. Or maybe you can add it as a stopword.

Bug no : 22160 - To my knowledge, "like" is not a stem of "likelihood" - at least not as a default stem in search engines. Where did you get this example? Does it work like that in the legacy application? If it does, you probably have a custom stemming rules file (stemming.dat), so please send it to me for review and inclusion in the project.

Bug No. 22162 - Similar to the previous, do you use thesaur.xml in the legacy application? Or maybe use WordNet synonyms? If you do, you should be able to find these on the legacy server. Check https://support.dtsearch.com/dts0190.htm for info in which folders this might be found.

Mar 06, 2021 at 10:01 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Naomi

,

Please see above comment of

Radomir

and advise him regarding the Subject Navigator bugs which you reported.

Mar 06, 2021 at 10:18 AM Notified 13 people

Naomi Joanis, UX Team Lead

Hi

Harsh

and

Radomir

,

In testing the Subject Navigator search, I am using the description of the search types from the legacy app and comparing results to the legacy app.

#22141 – the description of "Any Words" search type is:

"An "any words" search request consists of an unstructured natural language or "plain English" query. In a natural language search request, words such as AND and OR are disregarded. Use quotation marks to indicate a phrase, + (plus) to indicate a word that must be present, and - (minus) to indicate a word that must not be present."

and when I search "abuse and ego" on the legacy application, I get results for "abuse" and "ego" and not the word "and". This is not occurring on the current application, where I am only seeing results for the word "and" and not results for either "abuse" or "ego".
#22160 – I should clarify that even with stemming checked off I am not seeing any stem matches found. For example, searching "like" on the legacy application shows me results for "likely" and "likeness". I am only seeing matches for "like" on the current application even when stemming is checked off.
#22162 – Harsh , this seems like a question for you in terms of how we are matching synonym terms.

Please let me know if any of the tests I've performed above aren't accurate or should be modified, however, going by the results presented in both applications there do seem to be some issues related to the search options.

Thanks,

Naomi

Mar 06, 2021 at 2:37 PM Notified 13 people

Radomir Mladenovic

Naomi

#22141 - It's not that only "and" is being highlighted here. As the current version on test is not returning the complete list of results, you're seeing only the most "relevant" ones: the word "and" appears more often, so results containing it rise to the top of the results list. If you remove "and" from your query, I'm sure you'll see the other two keywords highlighted.
But, from your description of the legacy behavior, it looks like stopwords are used there. Harsh, can you check the legacy subjects navigator index for the noise word file (index_n)?
#22160 Looks good on my end. With stemming enabled, I see "likeness" found when "like" is used:

image.png 83.9 KB • Download

Mar 06, 2021 at 5:07 PM Notified 13 people

Radomir Mladenovic

Hi

Harsh

, in the 2021-03-07 folder you can find updated search service and indexer.

The indexer should fix the issue with FTS where keywords were matched in the document filename when searching for paragraphs. In that case the results were showing all paragraphs of the document although there were no text matches. You need to re-build the FTS index.
Update the FTS indexer config with option "CacheText": false. That will prevent caching of the original documents in the index and significantly lower the index size. During the next week I'll prepare an update that approaches PDF highlighting in a more lightweight way. As you don't have PDF highlighting integrated for now, it doesn't affect you.

Next, the search service has been updated to return all results and nodes for the Subjects Navigator search. However, I'd recommend checking again whether you need all nodes from the "SelectedNodes" and "hierarchicalparentids" fields. For example, a search for "good faith" has 734 matches and returns more than 8500 nodes. I also don't understand why node "A", which is supposed to be at the top level, references other nodes:

image.png 26.5 KB • Download

On my system this search takes about 3.5 seconds to execute and I cannot make it faster as we're fetching thousands of dtSearch documents. Running the search and getting these documents is about 0.2s, but getting the referenced nodes adds 3s more:

image.png 6.16 KB • Download

The response payload from the search service is more than 17 MB in size. That's a huge response which will take extra time to process on your side as well.

The update also addresses highlighting for boolean expressions for FTS, but I'd recommend also building the new FTS index using the updated indexer before testing this.

P.S. Tomorrow (Monday) I'll be off.

Mar 07, 2021 at 10:02 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Morgan

and

Radomir

,

As per above comment for FTS module,

Radomir

, will provide update on this by next week.

Morgan

, As per above comment of

Radomir

, we are rendering all the child branches of that particular parent branch, and it takes 15 seconds to load when searching for "good faith".

My suggestion is that we change the logic and render only the parent and child branches that match the keywords, rather than all branches.

Radomir

, currently the matching keywords are not highlighted because you changed the logic. We need the words highlighted as per the previous logic, that is, we need HighlightedFields within the result of subject-nav, as we can't implement the new logic at this time.

Mar 08, 2021 at 9:57 AM Notified 13 people

Radomir Mladenovic

Harsh

, you can get highlighted fields if you use "highlight-subject-nav" service (the message from March 6th). It's simple to re-enable fields highlighting on the basic search service but that will increase search response time from 3 seconds to 16 or even more, I don't remember exactly.
If you're sure you want to do this, in the SearchController line 249 replace with:

var r = CreateResponse(sm, true, indexes, true, true);

Just let me know if you do it so that I can update it on my side as well.

Mar 08, 2021 at 5:32 PM Notified 13 people

Morgan Maguire, CEO

Ok. Thanks for the update

Harsh

.

Ketan

, can you please ensure you discuss this with

Harsh

(and

Radomir

and

Rob

if required) so that we can determine when all dtSearch related features will be complete. Ideally we should have this project wrapped up at the beginning of next week so that

Melissa

and

Naomi

have an opportunity to perform further testing on the tool before launch on April 1st.

Thanks,

Morgan

Mar 08, 2021 at 6:46 PM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Morgan

Please confirm that we will render only the parent and child branches that match the keyword in the SN module, so that

Radomir

can change the logic and provide update to us.

Please confirm.

Mar 09, 2021 at 5:05 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Once Morgan confirms that we need results only for the matching branches, we need the logic changed so that we get only the SelectedNodes results, with the highlighted words included.

Morgan

, Please confirm.

Mar 09, 2021 at 6:25 AM Notified 13 people

Radomir Mladenovic

Harsh

please note that the most time consuming part of handling the subjects navigator search request is highlighting. If you get 700+ results, getting them all highlighted right away takes some time. As I already suggested, I think it's better to use non-highlighted results to build the results page quickly, then use "highlight-subject-nav" to get progressively highlighted fields for a subset of nodes (displayed on page), and update the page elements.

Mar 09, 2021 at 8:25 AM Notified 13 people
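
A minimal sketch of the two-phase approach described above: build the page from non-highlighted results first, then request highlighting only for the nodes currently on screen. The `fetchHighlights` delegate stands in for a call to the "highlight-subject-nav" service; that service's request and response shapes are not shown in this thread, so the signature is an assumption, as are the names `ProgressiveHighlighter` and `HighlightVisible`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ProgressiveHighlighter
{
    // Returns highlighted text for the nodes on one page only.
    // fetchHighlights is a hypothetical stand-in for a call to the
    // "highlight-subject-nav" service; its real contract is not shown here.
    public static Dictionary<int, string> HighlightVisible(
        IReadOnlyList<int> allNodeIds,
        int pageIndex,
        int pageSize,
        Func<IReadOnlyList<int>, IDictionary<int, string>> fetchHighlights)
    {
        if (pageSize <= 0) throw new ArgumentOutOfRangeException(nameof(pageSize));
        if (pageIndex < 0) throw new ArgumentOutOfRangeException(nameof(pageIndex));

        // Only the nodes currently displayed are sent for highlighting.
        List<int> visible = allNodeIds
            .Skip(pageIndex * pageSize)
            .Take(pageSize)
            .ToList();

        var highlighted = new Dictionary<int, string>();
        if (visible.Count == 0) return highlighted;

        foreach (var pair in fetchHighlights(visible))
        {
            highlighted[pair.Key] = pair.Value;
        }
        return highlighted;
    }
}
```

With 700+ matching nodes and a page size of 25, each call asks for at most 25 highlighted nodes, so the first page can be shown before any highlighting has finished.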

Harsh Parikh, Tech Lead

Hi

Radomir

,

As we discussed, my team is fully occupied with pending work on other applications, which we need to complete before 25th March, so we cannot change anything in the current build right now. Please do something so that we get only the selectednodes results, with the matched word highlighted.

That is, we need to take the selectednode results and, using each selectednode, return all of its branchid results in the response with the word highlighted.

Mar 09, 2021 at 8:28 AM Notified 13 people

Radomir Mladenovic

Harsh

can you please clarify "get only selectednodes result"? You want to remove "hierarchicalparentids" and enable highlighting as it was before?

Mar 09, 2021 at 9:15 AM Notified 13 people

Harsh Parikh, Tech Lead

Yes

Radomir

. We need to get only the results for all branch IDs of the selectednodes.

For example, if you search for "abuse of process", get the selectednodeid of every row and then return the results for all of those branch IDs.

For instance, if the selectednodeid values are (1,5,8,11,15,16), then we need the result for each branchid in (1,5,11,15,16). In other words, we need to render only the parent and child branches that match the keyword, with word highlighting enabled.

But first we need confirmation from Morgan, so please wait for Morgan's reply.

Mar 09, 2021 at 9:30 AM Notified 13 people

Morgan Maguire, CEO

Hi

Harsh

,

I'm not following what you're requesting me to approve. Please provide me with a concrete example of how this affects the display of the results.

Also, it sounds like we're pushing UI work onto

Radomir

. Please ensure that we're not using his services for work that should be done by your team.

Radomir

was engaged to build the customized indexes and advise your team on implementation, so let's ensure we're not expanding that scope.

Thanks,

Morgan

Mar 09, 2021 at 1:40 PM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Morgan

,

We would prefer to take call with

Radomir

about the SN & DD module search, along with Melissa, as we want to fix the requirements that we can quickly (within 1 to 2 days) integrate into the system.

Let's schedule a call for tomorrow, 10th March, at 8:15 AM Vancouver time. Please co-ordinate with

Radomir

and schedule the meeting.

Mar 09, 2021 at 2:42 PM Notified 13 people

Morgan Maguire, CEO

Ok. That's fine

Harsh

. I'll setup the call. However, I have a very full schedule this week, and I'll need to limit the call to 30 minutes. So please send any details in advance of the call so that we can get right to the issues.

Thanks,

Morgan

Mar 09, 2021 at 2:52 PM Notified 13 people

Morgan Maguire, CEO

I've sent out a calendar invite for a call tomorrow at 8:15am Vancouver time between

Harsh

,

Ketan

,

Melissa

,

Radomir

and myself. Let me know if anyone else needs to join and I can add them to the calendar invite.

Thanks,

Morgan

Mar 09, 2021 at 5:09 PM Notified 13 people

Rob Wiesenberg

Morgan

please send me an invitation as well. Thanks.

Mar 09, 2021 at 5:37 PM Notified 13 people

Radomir Mladenovic

Hi

Harsh

, I updated the search service for the discussed subjects navigator search changes: disabled highlighting and collecting only "selectednodes". You can find it in the 2021-03-10 folder.
I'll have a new disputes search method for you later tonight or tomorrow morning.

Mar 10, 2021 at 5:56 PM Notified 13 people

Radomir Mladenovic

Hi

Harsh

, I updated the search controller in the 2021-03-10 folder with a new method: "disputes-details"
You need to send it the same request (with the query and filters) as to the "disputes", extended with the additional filtering fields:

    "FieldFilterName": "disputeid",
    "FieldFilterValues": [12877]

I hope this does what you needed it to do. Let me know if you have any questions.

Mar 10, 2021 at 9:16 PM Notified 13 people
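
As a rough illustration of the instruction above, a client could derive the "disputes-details" payload from an existing "disputes" payload by adding the two filter fields. This is only a sketch using System.Text.Json.Nodes; the class and method names (`DisputesRequests`, `ToDetailsRequest`) are hypothetical, and only the two field names and the sample value 12877 come from the message.

```csharp
using System;
using System.Text.Json.Nodes;

public static class DisputesRequests
{
    // Copies a "disputes" request and adds the two filter fields quoted above.
    public static string ToDetailsRequest(string disputesRequestJson, int disputeId)
    {
        JsonNode parsed = JsonNode.Parse(disputesRequestJson)
            ?? throw new ArgumentException("Request must be a JSON object.");
        JsonObject request = parsed.AsObject();

        // Existing query and filter properties are left untouched.
        request["FieldFilterName"] = "disputeid";
        request["FieldFilterValues"] = new JsonArray(JsonValue.Create(disputeId));
        return request.ToJsonString();
    }
}
```

Calling `ToDetailsRequest(json, 12877)` on a "disputes" request keeps its query, filters and sort settings and appends the two new properties.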

Harsh Parikh, Tech Lead

Hi

Radomir

,

The Subject Navigator is OK now. But for the Dispute Document module, if we pass only the English language filter, the response takes a very long time. Also, as discussed, we don't want the word highlighted in the Dispute Document module either.

The following is my search request when I filter by English language only.

{"searchRequest":"","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"Field_62","values":["234"]}]},"SortField":"FullCitationText","SortOrder":"asc"}

Please look into this and let us know.

Mar 11, 2021 at 4:34 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Another issue: when we send the following request, we don't get any results.

{"searchRequest":"","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"Field_62","values":["235"]},{"type":"match","field":"Field_109","values":["377"]}]},"SortField":"FullCitationText","SortOrder":"asc"}

Mar 11, 2021 at 5:17 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We have implemented the new method, "disputes-details". But I see that you are returning the result from the first stored proc.

We need the result from the second stored proc, SelectDisputeContegraSearch, by passing the Dispute Id.

Mar 11, 2021 at 7:07 AM Notified 13 people

Radomir Mladenovic

Hi

Harsh

, I guess I misunderstood what you need for the dispute-details. To return results from the second stored procedure, I'll need to make changes to the indexer and create a separate index for the second procedure data.
As for the dispute search performance, I guess it's because you're searching without a keyword, so you're getting a long list of results. (I believe the highlighting is already off.) Can you please send me the timeLog from the response? It should show the actual time and number of records collected.

Mar 11, 2021 at 7:49 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Please don't change anything. You don't need to create a separate index; using the following code in your SearchController for the disputes-details method is fine. We are passing the same request (with the query and filters) as to the "disputes", extended with the additional filtering fields:

[HttpPost("disputes-details")]
public IActionResult SearchDisputesDetails([FromBody] SearchModel sm)
{
    Stopwatch stopwatch = new Stopwatch();
    stopwatch.Start();

    List<string> indexes = new List<string>() { Settings.Tologix.DisputesIndex };
    ApiError err = Search(sm, indexes, false);
    if (err != null)
    {
        return new ObjectResult(err);
    }
    var searchTime = stopwatch.ElapsedMilliseconds;
    bool highlight = false; // sm.SearchRequest != "xlastword" && !string.IsNullOrWhiteSpace(sm.SearchRequest);
    var r = CreateResponse(sm, false, indexes, highlight, false);
    r.TimeLog.Insert(0, "search: " + searchTime);

    Stopwatch stopwatch2 = new Stopwatch();
    stopwatch2.Start();

    // get distinct DisputeId values in results
    WordListBuilder wordListBuilder = new WordListBuilder();
    wordListBuilder.OpenIndex(Settings.Tologix.DisputesIndex, indexCache);
    wordListBuilder.SetFilter(sm.ResultsAsFilter);
    int values = wordListBuilder.ListFieldValues("DisputeId", "*", 10000);
    log.LogDebug("Found " + values + " disputes (wordListBuilder.Count = " + wordListBuilder.Count + ")");
    List<string> disputeIds = new List<string>();

    for (int i = 0; i < wordListBuilder.Count; ++i)
    {
        String word = wordListBuilder.GetNthWord(i);
        int docCount = wordListBuilder.GetNthWordDocCount(i);
        disputeIds.Add(word);
        //log.LogDebug("- " + word + " " + docCount);
    }
    List<string> disputewithcontentypedatamaster = new List<string>();
    for (int j = 0; j < disputeIds.Count; j++)
    {
        disputewithcontentypedatamaster.AddRange(disputeIds[j].Split(','));
    }
    // get all documents for found disputes
    if (indexes != null && disputewithcontentypedatamaster.Count > 0)
    {
        var disputesResults = new List<ResultDocument>();
        FindByField("ContentTypeDataMasterId", disputewithcontentypedatamaster, new List<string>() { Settings.Tologix.DisputeDocsIndex },
            disputeRes =>
            {
                for (int i = 0; i < disputeRes.Count; ++i)
                {
                    disputeRes.GetNthDoc(i);
                    disputesResults.Add(createResultDocument(disputeRes.CurrentItem));
                }
            });
        r.Results = disputesResults;
    }
    r.TimeLog.Add("collect nodes for disputeId (" + r.Results.Count + "): " + stopwatch2.ElapsedMilliseconds);

    stopwatch.Stop();
    r.TimeLog.Add("total: " + stopwatch.ElapsedMilliseconds);

    return Ok(r);
}

I think it works OK. But when we search with only the language filter, it takes a long time and highlighting is still on.

Can we have a short call so that we are both on the same page?

Mar 11, 2021 at 9:42 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

If we search with only the English language filter, this is the JSON we are passing.

{"searchRequest":"","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"Field_62","values":["234"]}]}}

Postman times out when we send the above search request.

image.png 185 KB • Download

Mar 11, 2021 at 9:51 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Sorting is also not working on the results in the Dispute Document module. We are passing the sort field and sort order in the JSON request, but the result is not sorted.

If I want to sort by FullCitation in descending order, we pass the following JSON request.

Sort Field : FullCitation
Sort Order : desc

{"searchRequest":"","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"Field_109","values":["403"]}]},"SortField":"FullCitationText","SortOrder":"desc"}

Mar 11, 2021 at 10:14 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Have you gone through the Dispute & Document module queries above? Also, could you let us know whether the FTS module is done from your side?

We are planning to deploy the WebAPI and Indexer on the server by early next week.

Mar 12, 2021 at 5:15 AM Notified 13 people

Radomir Mladenovic

Hi

Harsh

, I was travelling for most of the day yesterday, so I didn't have time to check the Disputes search. Will do it today.
As for the FTS, I'm not aware of any open issues, so I consider it complete from my side.

Mar 12, 2021 at 6:21 AM Notified 13 people

Harsh Parikh, Tech Lead

OK Thanks

Radomir

for the update. Let me know if you need anything for the Dispute Document module.

Mar 12, 2021 at 6:27 AM Notified 13 people

Harsh Parikh, Tech Lead

Radomir

, please note that we have changed the disputes-details method in the search controller.

I have attached the updated SearchController and both JSON files for the Dispute Document module.

SearchController.cs 46.4 KB • Download

indexer-config-dispute-docs.json 480 Bytes • Download

indexer-config-disputes.json 502 Bytes • Download

Mar 12, 2021 at 6:30 AM Notified 13 people

Radomir Mladenovic

Hi

Harsh

, thanks, I copied your changes.

I'll try to address the issues in the order you reported them:

1) Response taking too long for a filter-only search:

{"searchRequest":"","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"Field_62","values":["234"]}]},"SortField":"FullCitationText","SortOrder":"asc"}

You could increase the timeout in Postman; on my system the request takes about 78 seconds. However, the problem with this query is that it gives way too many results: there are 231956 records returned for the 5783 matches initially found! The response is over 200MB!

I think it doesn't make sense to return all results. With a few different searches, someone could scrape your whole database. I think we should put a hard limit of, say, 1000 (or 10K) items and never return more than that. It is better to give the user an error asking them to refine the query.

Mar 12, 2021 at 8:26 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We need only the first DisputeId value, which we pass to the second query for the result. Suppose the DisputeIds are (12877,1500,1501,1502): you can always fetch the first value of the DisputeID column (12877) and pass it to the second query.

This is because we first need to display only the dispute node, and then on click we call our second method (disputes-details) to get all the data.

I am available for call to discuss.

Mar 12, 2021 at 8:35 AM Notified 13 people

Radomir Mladenovic

Harsh

I don't understand the changes you made to the "disputes-details". You just copied the whole "disputes"; there are no other changes. What's the point? If you can still use the "disputes" with just a modified fields search, then let's do that and avoid code duplication.

I'm available for a call atm. Will call you in a few minutes.

Mar 12, 2021 at 8:37 AM Notified 13 people

Radomir Mladenovic

Harsh

I'm attaching the modified controller with the discussed changes for the "disputeId" handling in the first and the second disputes search methods.

SearchController.cs 43.1 KB • Download

As mentioned in the call, sorting is now broken because the sort is applied to the first dtSearch search call, where we collect the "disputeId" values. In the rest of the method we collect those dtSearch documents that have the disputeId. However, as we're hitting the dtSearch query size limit (70K, because we need to enumerate all the different disputeIds), this result fetching runs in batches, so the sorting cannot be applied in dtSearch.

I think I saw that you have only a couple of different fields here for which you use sorting, both referenced with SortField. Is that correct? In that case I can implement sorting in the controller, after fetching all the results. I cannot finish it today, but hopefully you can have it tomorrow afternoon.

Mar 12, 2021 at 9:32 AM Notified 13 people
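
The batching and controller-side sorting described above could look roughly like the sketch below. It is not the attached controller code: `BatchedFetch`, `SplitByQuerySize` and `FetchAllSorted` are hypothetical names, `fetchBatch` stands in for the dtSearch lookup, and counting one separator character per id is an assumption about how the query is assembled. Only the idea (a size-limited query, fetched in batches, sorted once afterwards) comes from the message.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BatchedFetch
{
    // Splits ids into groups whose joined length stays under maxQueryChars.
    public static List<List<string>> SplitByQuerySize(IEnumerable<string> ids, int maxQueryChars)
    {
        var batches = new List<List<string>>();
        var current = new List<string>();
        int currentSize = 0;
        foreach (string id in ids)
        {
            int cost = id.Length + 1; // +1 for a separator between values (assumed)
            if (current.Count > 0 && currentSize + cost > maxQueryChars)
            {
                batches.Add(current);
                current = new List<string>();
                currentSize = 0;
            }
            current.Add(id);
            currentSize += cost;
        }
        if (current.Count > 0) batches.Add(current);
        return batches;
    }

    // Fetches every batch, then sorts the combined list once, in memory.
    public static List<T> FetchAllSorted<T>(
        IEnumerable<string> ids,
        int maxQueryChars,
        Func<IReadOnlyList<string>, IEnumerable<T>> fetchBatch,
        Func<T, string> sortKey,
        bool descending)
    {
        var all = new List<T>();
        foreach (var batch in SplitByQuerySize(ids, maxQueryChars))
        {
            all.AddRange(fetchBatch(batch));
        }
        var comparer = StringComparer.OrdinalIgnoreCase;
        return descending
            ? all.OrderByDescending(sortKey, comparer).ToList()
            : all.OrderBy(sortKey, comparer).ToList();
    }
}
```

Sorting after the last batch is what makes the order correct across batch boundaries; sorting each batch separately would not.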

Harsh Parikh, Tech Lead

OK Thanks

Radomir

. I hope the performance issue is resolved when we pass only the English language filter. Please correct me if I'm wrong.

Also, please let us know once you have completed the sorting in the controller.

Mar 12, 2021 at 10:10 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

I have replaced the Search Controller and tested with only the English language filter. But it is still taking a long time to get a response.

Mar 12, 2021 at 10:21 AM Notified 13 people

Radomir Mladenovic

Harsh

, yes, I saw that. But that's expected. Still too many nodes. In fact, when I debugged it, I saw that in this case none of the matched records had more than one disputeId in the field, so the change didn't affect this at all.

Mar 12, 2021 at 10:40 AM Notified 13 people

Harsh Parikh, Tech Lead

OK

Radomir

. If we make the keyword search mandatory, it will resolve this issue (meaning the user would have to enter a keyword along with any other filters).

What is your suggestion? Will it work?

Mar 12, 2021 at 10:48 AM Notified 13 people

Radomir Mladenovic

Harsh

I suggest adding some absolute limit anyway, e.g. 1000 items. If a keyword is mandatory and the user enters some very common word, we're back to the same problem.

Mar 12, 2021 at 11:18 AM Notified 13 people

Harsh Parikh, Tech Lead

Radomir

, I don't follow what you mean by an absolute limit. I assume you mean setting a limit of up to 1000 items, right?

Mar 12, 2021 at 11:23 AM Notified 13 people

Radomir Mladenovic

Yes, I think we should stop somewhere. You cannot return 200K results to the user, as it happens now with your metadata example.

Mar 12, 2021 at 11:28 AM Notified 13 people

Harsh Parikh, Tech Lead

Radomir

, we cannot implement pagination or lazy loading before launch, so we need to find an alternative solution.

Mar 12, 2021 at 12:54 PM Notified 13 people

Radomir Mladenovic

I see limiting results at some point as an alternative solution, which is very simple to implement.

Mar 12, 2021 at 1:54 PM Notified 13 people

Radomir Mladenovic

Harsh

there are updated files in the 2021-03-12 folder, with the following changes:

1) Added sorting by field after collecting all results in Disputes search. (BTW, in your sorting example, the field should have been "FullCitation" instead of "FullCitationText".)

2) Added hard-stop limit for the number of results returned for the Disputes and Subjects Navigator search. It's optional but, if you want to enable it, add "ResultListStopCount" to the Tologix section:

{
  "SearchSettings": {
    "Tologix": {
      "ResultListStopCount": 10000,
      ...

Mar 12, 2021 at 9:47 PM Notified 13 people

Harsh Parikh, Tech Lead

Thanks

Radomir

. Will change the sort field name and set as FullCitation.

But I am still confused about the hard-stop limit for the number of results. Could you explain it in more detail?

What does ResultListStopCount mean?

Mar 13, 2021 at 5:45 AM Notified 13 people

Radomir Mladenovic

Harsh

, until you implement pagination, ResultListStopCount gives you an option to limit the number of results returned, so that a user cannot pull 200K results, which would take an eternity to render anyway. I see it as a means to prevent unnecessary load on the service. If you don't want to use it, that's fine as well.
When this limit is reached, the search service stops collecting results, returns what was collected up to that point, and sets the "HardStop" flag in the response. You could use this flag in your app to tell the user that the search is too broad and needs to be refined.

Mar 13, 2021 at 9:23 AM Notified 13 people
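
The hard-stop behaviour explained above can be sketched as follows. This is an illustration under assumptions, not the service's code: the types `LimitedResults<T>` and `ResultCollector` are hypothetical, a `stopCount` of zero or less is assumed to mean "limit disabled", and the flag is set only when at least one further result was left uncollected.

```csharp
using System.Collections.Generic;

public sealed class LimitedResults<T>
{
    public List<T> Results { get; } = new List<T>();

    // Mirrors the "HardStop" flag described above.
    public bool HardStop { get; set; }
}

public static class ResultCollector
{
    // stopCount <= 0 means "no limit", modelling the optional setting.
    public static LimitedResults<T> Collect<T>(IEnumerable<T> source, int stopCount)
    {
        var response = new LimitedResults<T>();
        foreach (T item in source)
        {
            if (stopCount > 0 && response.Results.Count >= stopCount)
            {
                response.HardStop = true; // more results existed than were returned
                break;
            }
            response.Results.Add(item);
        }
        return response;
    }
}
```

A client can then check `HardStop` on the response and, when it is true, show a message asking the user to narrow the search instead of rendering a truncated list silently.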

Darsh Shah

Hi

Radomir

,

For the FTS module, when we search for the word "nationality", the result for the following PDF file lists paragraph number 28.
But when we click on number 28, we cannot find the text in the PDF file.
We are facing this issue in other files as well.
I have attached one of the PDFs and a video file below. Please check and confirm.

PUB-37-3 - History of ICSID Convention - vol II-1b.pdf 7.88 MB • Download

Record_2021_03_15_11_04_11_156.mp4 2.93 MB • Download

Mar 15, 2021 at 5:39 AM Notified 13 people

Radomir Mladenovic

Hi

Harsh

, I sent you a fix to the 2021-03-15 folder.
Note that the page you sent me is not page 28. This is a PDF document, so the results are pages, not paragraphs. Page 28 contains "nationality":

image.png 59.3 KB • Download

Mar 15, 2021 at 8:07 AM Notified 13 people

Radomir Mladenovic

Darsh

sorry, the above message should have been addressed to you, not Harsh.

Mar 15, 2021 at 8:11 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

In many places we found that clicking on the page number doesn't populate the result. Is this issue resolved in general, or only for this specific PDF file?

Mar 15, 2021 at 8:28 AM Notified 13 people

Radomir Mladenovic

Harsh

the fix addresses similar cases as well.

Mar 15, 2021 at 8:41 AM Notified 13 people

Harsh Parikh, Tech Lead

OK Thanks

Radomir

for the quick reply. We will integrate it, check, and let you know.

Mar 15, 2021 at 8:43 AM Notified 13 people

Darsh Shah

Hi

Radomir

,

For the FTS module, when we search for the word "Like", some documents in the results display the same paragraph multiple times.

Please check the attached video for more information.

Record_2021_03_16_11_56_14_698.mp4 5.64 MB • Download

Also, please find attached PDF and HTML files for your reference.

BIT-0081 - Albania-United States BIT (excerpts)-1.html 16 KB • Download

BIT-0081 - Albania-United States BIT (excerpts)-1.pdf 107 KB • Download

Mar 16, 2021 at 6:29 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Today we deployed the latest version on the server (10.68.138.10), but we are not able to find the paragraph number in the FTS module.

image.png 245 KB • Download

Could you please look into this urgently?

The following are the things you can check on the server (10.68.138.10):

DBIndexerProjectPath : E:\DevContegraISLGRebuildStagingDBIndexer

Indexes : E:\DevContegraISLGRebuildStagingIndexes

DocumentPath : E:\ISLGRebuildStaging\wwwroot\Documents

Mar 16, 2021 at 11:49 AM Notified 13 people

Morgan Maguire, CEO

Hi

Rob

and

Radomir

,

Further to most recent results for the searches in Dispute & Dispute Documents: Re: Dispute Documents search field does not produce any results - TOLOGIX - ISLG App Rebuild, the Subject Navigator: Re: Problem with Subject Navigator search field, live site - TOLOGIX - ISLG App Rebuild and the issues with the FTS above, I am very concerned with the lack of progress in finding resolution on this project.

Could you please ensure that you connect with

Harsh

and

Ketan

immediately to determine a solution to these problems.

Thanks,

Morgan

Mar 16, 2021 at 12:31 PM Notified 13 people

Rob Wiesenberg

Morgan

, yes we will continue to work with the team to get these resolved.

Mar 16, 2021 at 12:33 PM Notified 13 people

Morgan Maguire, CEO

Ok. Thanks

Rob

.

Morgan

Mar 16, 2021 at 12:35 PM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Today we deployed TologixDBIndexer and WebAPI on the server, and we found that the indexes created for the FTS module are smaller than in our local environment.

For example, on the server (10.68.138.10), in the FTS Para folder, index_r_1 is 61,000 KB, whereas locally it was 463,030 KB.

Is there an issue? We deployed the same build that we used in our local environment.

The Subject Navigator and Dispute Document indexes were created successfully and work as per our criteria, but in the FTS module we are not able to get the paragraph or page number.

Please look into this as soon as possible, as we need to release this module for UAT by tomorrow.

Mar 16, 2021 at 2:08 PM Notified 13 people

Naomi Joanis, UX Team Lead

Hi

Radomir

,

Following up on some of the bugs I logged earlier, I'm still not sure if the search is working as expected.

#22162 - Synonyms > Not Working

I understand dt search is using the WordNet synonyms for their search, through there I verified that "Bias" & "Prejudice" are synonyms (http://wordnetweb.princeton.edu/perl/webwn?s=bias&sub=Search+WordNet&o2=&o0=1&o8=1&o1=1&o7=&o5=&o9=&o6=&o3=&o4=&h=). If I search bias in the SN with synonyms checked off, I am not seeing results for prejudice as I am in the legacy application.

image.png 311 KB • Download

image.png 149 KB • Download

#22160 - Stemming > Not Working

Similarly, it seems like stemming is not working properly. When stemming is checked off, I am not seeing results for "Likely" when I search for "like" for example.

image.png 262 KB • Download

image.png 162 KB • Download

#22141 – Any words > Search doesn't disregard words such as "and" or "or"

The results take minutes to populate, whereas in the legacy application the results appear within seconds. See videos for more details
- Screen Recording 2021-03-16 at 10.20.28 AM.mov 2.27 MB • Download
- Screen Recording 2021-03-16 at 10.19.26 AM.mov 7.21 MB • Download

#22155 - All Words

If I select "All words" and enter "education award" I expect to see matches where all the words entered are present for there to be a match. This doesn't seem to be occurring in the current application.
image.png 185 KB • Download
image.png 132 KB • Download

Please let me know if anything is unclear with these issues, or if I'm performing invalid scenarios.

Mar 16, 2021 at 3:08 PM Notified 13 people

Radomir Mladenovic

Harsh

, for me the FTS paragraphs index size is 1.4 GB. I suspect that the URL to the PDF Highlighter in your config file is no longer valid. You said it worked before, so maybe something changed in the IIS config. You reference the Highlighter using:

"PdfHighlighterUrl": "http://10.68.138.10/highlighter/",

But I get 404 error page when I open http://10.68.138.10/highlighter/

As you're running this directly on the server, there's no need to go through IIS. Try using:

"PdfHighlighterUrl": "http://10.68.138.10:8998",

Hope this helps. Let me know.

Mar 16, 2021 at 3:08 PM Notified 13 people

Radomir Mladenovic

Hi

Naomi

,

Harsh

,

#22162 - I already commented on this one on Mar 6 and asked you to check the thesaurus config on the legacy server so that we can figure out which thesaurus you're using. I didn't get any feedback on this.
Or, if you have source code of the legacy search application that can be helpful as well in figuring out thesaurus options currently used.

#22160 - As before, looks good on my end. Testing this in the Subjects Navigator index, "like" gives 273 results with stemming off, and 392 results with stemming enabled. (My screenshot sent on Mar 6 also shows this working.)
Maybe the web application is not sending the stemming checkbox value to the search service properly.

#22141 - How many results are you pulling in the result page in the legacy application? The problem is the new application is trying to get all results at once. When I test it, I get more than 32000 results and the response is 70MB. And that's without stemming and fuzzy that you have enabled in your search.

#22155 - I'm not sure I understand this one. From the comment in your screenshot, I'd say you expect only those results containing all keywords in the branch name only. Is that correct? If true, that's a new requirement for me. We index all meta fields in order to support filtering but I don't think that dtSearch supports search requests limited to a single field (e.g. branch name). I'll check this.

Mar 16, 2021 at 4:49 PM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Could you please let us know which of the following two PDF Highlighter URLs we should use on the server?

http://10.68.138.10:8998/highlighter

or

http://10.68.138.10:8998

Mar 16, 2021 at 5:10 PM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Could you please provide a sample search request with stemming for the Subject Navigator module, so that we can check our service request by tomorrow?

Mar 16, 2021 at 5:11 PM Notified 13 people

Radomir Mladenovic

Hi

Harsh

, sorry if I wasn't clear. You should use http://10.68.138.10:8998
I tried the one you're using (http://10.68.138.10:8998/highlighter); it doesn't work properly and creates a much smaller index.

As for stemming, the syntax has been the same since it was introduced, and you already have it in many of the examples posted:

{
    "searchRequest": "education award",
    "SearchType": "AllWords",
    "Stemming": true,
    "Synonyms": false,
...

Mar 16, 2021 at 5:34 PM Notified 13 people

Radomir Mladenovic

Harsh

Naomi

please confirm how the Subjects Navigator search should work - related to my comment about issue #22155 above.
dtSearch does not support search requests limited to a single field using a search query and boolean/anywords/allwords options. We could work around this by transforming the query into a fielded search (https://support.dtsearch.com/webhelp/dtsearch/field_searching.htm), but I need confirmation from your end as it's extra development work.

Mar 16, 2021 at 5:41 PM Notified 13 people
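
To make the proposed workaround concrete, the sketch below rewrites a plain keyword query into a fielded request of the form `(field contains (...))`, which is the general shape described on the dtSearch field searching page linked above. The helper `FieldedQuery.Build` and the field name `BranchName` are hypothetical, and the exact syntax should be confirmed against the dtSearch documentation before relying on it.

```csharp
using System;

public static class FieldedQuery
{
    // Builds e.g. "(BranchName contains (education and award))".
    // Note: words such as "and" / "or" typed by the user are not escaped here.
    public static string Build(string fieldName, string userQuery, bool allWords)
    {
        string[] words = userQuery.Split(
            new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);
        if (words.Length == 0) return string.Empty;

        // "All words" joins with and; "any words" joins with or.
        string op = allWords ? " and " : " or ";
        return "(" + fieldName + " contains (" + string.Join(op, words) + "))";
    }
}
```

For the "education award" example from issue #22155, an all-words request against a hypothetical `BranchName` field would become `(BranchName contains (education and award))`.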

Harsh Parikh, Tech Lead

Naomi

, could you please reply to Radomir's comment above on bug no. 22155, as you have the requirements for those search results?

Mar 16, 2021 at 5:52 PM Notified 13 people

Radomir Mladenovic

Harsh

, regarding the issue with paragraphs shown multiple times, I cannot reproduce it. The result I'm getting for the sample HTML you sent me shows only one paragraph:

image.png 25.2 KB • Download

and extracted paragraphs don't contain the other "copy" of the paragraph. Here are the all paragraphs extracted:

test4-new-extracted.html 8.44 KB • Download

Please notice that in your video the paragraphs are not exactly the same. Some have an extra dot, some have parentheses, etc. I suspect that you didn't delete the old index after I provided the updated indexer, so paragraphs extracted by the new indexer were added on top of what was already in the index (and the old paragraphs were not overwritten because the paragraph ID format was modified).
Delete all three FTS indexes and re-index, and I think it will be fine.

Mar 16, 2021 at 6:11 PM Notified 13 people

Morgan Maguire, CEO

Hi

Radomir

and

Harsh

,

Re #22155, I don't think there is a problem running the search request across different fields; however, as

Naomi

described in the user story, the results in the new ISLG do not include the result from legacy application: https://www.investorstatelawguide.com/ResearchTools/SubjectNavigator?toc=content&id=50&tab=r&search=education+award&searchType=all&stem=1&thes=&ftypo=1&fuzziness=1

image.png 94 KB • Download

Why isn't this branch included within the All Words search for "education award" within the new ISLG when it fits the parameters of the search? http://staging.investorstatelawguide.com/SubjectNavigator/Index?branchid=6UQPkRs5-Qc%3D

image.png 108 KB • Download

Thanks,

Morgan

Mar 16, 2021 at 8:11 PM Notified 13 people

Radomir Mladenovic

Hi

Morgan

, sorry, I cannot tell from your screenshots what's different. I think I see the same branches. I don't have a login for either the current production or staging to compare the live websites.
When I run this search in my environment, using staging data, I think I see the same document:

image.png 94.3 KB • Download

Mar 16, 2021 at 9:02 PM Notified 13 people

Morgan Maguire, CEO

That's odd

Radomir

. I've given you access to the subscriber side of the staging environment; you should have received an automated email prompting you to activate your account. The subject navigator search is located here: http://staging.investorstatelawguide.com/SubjectNavigator/Index

Harsh

, do you have any idea why the results on staging.islg are different from what

Radomir

has shown above?

Morgan

Mar 16, 2021 at 11:54 PM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Morgan

,

As per the screenshots you provided, both show the same result. What is the difference? Both screenshots display the same result for "education award".

Mar 17, 2021 at 2:16 AM Notified 13 people

Morgan Maguire, CEO

Hi

Harsh

,

The results on staging.islg currently don't include the result highlighted by

Radomir

above: http://staging.investorstatelawguide.com/SubjectNavigator/Index?searchType=1&searchRequest=education%20award&isStemming=1&isSynonyms=0&isFuzzy=1&fuzzyTypo=1

image.png 146 KB • Download

Thanks,

Morgan

Mar 17, 2021 at 2:51 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

As per Morgan's comment above, we are passing the following search request for the Subject Navigator, but we are not able to get the branch that Morgan showed us.

Could you please look into this search request and advise? We are available for a call to resolve this issue.

{
"ErrorMessage":null,
"WasError":false,
"SearchRequest":"education award",
"PageNum":0,
"PageSize":0,
"Fuzzy":true,
"Fuzziness":1,
"Stemming":true,
"WordNetSynonyms":false,
"Synonyms":false,
"PhonicSearching":false,
"SearchType":1,
"SortField":null,
"SortOrder":null,
"SearchFlags":0,
"Custom":null,
"NoFrames":false,
"EnableDateSearch":false,
"StartDate":null,
"EndDate":null,
"FileConditions":null,
"BooleanConditions":null,
"QueryStatement":null,
"FilterStatement":null,
"Facets":null,
"IxId":null,
"IndexIds":null,
"IncludeSynopsis":true,
"Near":14,
"ExcludeEnabled":false,
"ExcludeTerm":null,
"TreePath":null,
"paraId":null,
"FieldFilterName":null,
"FieldFilterValues":null
}

Mar 17, 2021 at 5:36 AM Notified 13 people

Radomir Mladenovic

Hi

Harsh

, I'm getting the same results using your request payload. I'm attaching the complete response.
Maybe there's something wrong with your index? My index is 260MB in size.
I'll give you a call in a few minutes.

response.json 52.1 KB • Download

Mar 17, 2021 at 7:09 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Thanks for taking the call.

Naomi

, the following are the comments on the SN bugs.

#22162 - I already commented this one on Mar 6 and asked you to check the thesaurus config on the legacy server so that we figure out which thesaurus you're using. I didn't get any feedback on this.
Or, if you have source code of the legacy search application that can be helpful as well in figuring out thesaurus options currently used. - It is resolved on staging.islg with the help of

Radomir

. You can check.

#22160 - As before, looks good on my end. Testing this in the Subjects Navigator index, "like" gives 273 results with stemming off, and 392 results with stemming enabled. (My screenshot sent on Mar 6 also shows this working.)
Maybe the web application is not sending the stemming checkbox value to the search service properly. - It is resolved on staging.islg with help of

Radomir

so you can check it.

#22141 - How many results are you pulling in the result page in the legacy application? The problem is the new application is trying to get all results at once. When I test it, I get more than 32000 results and the response is 70MB. And that's without stemming and fuzzy that you have enabled in your search. -

Radomir

will need to look into it.

#22155 - I'm not sure I understand this one. From the comment in your screenshot, I'd say you expect only those results containing all keywords in the branch name only. Is that correct? If true, that's a new requirement for me. We index all meta fields in order to support filtering but I don't think that dtSearch supports search requests limited to a single field (e.g. branch name). I'll check this. It is resolved on staging.islg with help of

Radomir

so you can check it.

Mar 17, 2021 at 9:31 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

When we get the response from the Subject Navigator, the branch name is cut off. We are not getting the whole branch name. We checked our SQL view, and it provides you the full branch name.

For example, when you search for "abuse of process", you get BranchId 19867 in the results, with the branch name as follows:

Philip Morris v. Australia Award on Jurisdiction and Admissibility considers that the initiation of a treaty-based investor-State arbitration constitutes an abuse of rights (or an abuse of process, the rights abused being procedural in nature) when an investor has changed its corporate structure to gain the protection of an investment treaty at a point in time when a specific dispute was foreseeable; a dispute is foreseeable when there is a reasonable prospect that a measure which may give rise to

But the actual branch text is:

Philip Morris v. Australia Award on Jurisdiction and Admissibility considers that the initiation of a treaty-based investor-State arbitration constitutes an abuse of rights (or an abuse of process, the rights abused being procedural in nature) when an investor has changed its corporate structure to gain the protection of an investment treaty at a point in time when a specific dispute was foreseeable; a dispute is foreseeable when there is a reasonable prospect that a measure which may give rise to a treaty claim will materialize

The final part, "a treaty claim will materialize", is cut off in your response.

Please check and confirm.

Mar 17, 2021 at 9:35 AM Notified 13 people

Radomir Mladenovic

Harsh

, I copied the "wordnet" folder with the synonyms database to your dtsearch config folder on the server.
WordNet is already supported by the search controller. However, instead of using "Synonyms" in the search model, you need to use "WordNetSynonyms", as in:

{
    "SearchRequest": "like",
    "SearchType": "AllWords",
    "WordNetSynonyms": true,
    "PageSize": 200,
    "Fuzzy": false,
    "Fuzziness": 1,
...

That should be all you need. I tested it on my side and synonyms work.
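A minimal client-side sketch of the switch described above, assuming the web application builds its request as a plain dictionary. The helper name `use_wordnet_synonyms` is hypothetical and is not part of the search service; it only moves the flag the client was sending under "Synonyms" to "WordNetSynonyms".

```python
def _as_bool(value) -> bool:
    """Interpret JSON-style flags, including the strings "true"/"false"."""
    if isinstance(value, str):
        return value.strip().lower() == "true"
    return bool(value)


def use_wordnet_synonyms(payload: dict) -> dict:
    """Return a copy of the payload with the thesaurus flag under WordNetSynonyms."""
    fixed = dict(payload)
    # Take whatever the client sent as "Synonyms" and drop that key.
    legacy = _as_bool(fixed.pop("Synonyms", False))
    current = _as_bool(fixed.get("WordNetSynonyms", False))
    fixed["WordNetSynonyms"] = current or legacy
    return fixed


request = use_wordnet_synonyms({"SearchRequest": "like", "Synonyms": True, "PageSize": 200})
print(request)
```

The string handling is there because some client payloads in this thread send flags as "true"/"false" strings rather than JSON booleans.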

Mar 17, 2021 at 9:38 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

In which folder on the server did you copy the "wordnet" folder? We can't see any such folder in the indexing folder.

Also, do we need to change the Synonyms name in your WebAPI project's SearchModel?

Mar 17, 2021 at 9:45 AM Notified 13 people

Radomir Mladenovic

Harsh

, you don't need to change anything in the SearchModel - WordNetSynonyms already exists. Just send it in the request when you want to search with the thesaurus.
WordNet folder location:

image.png 13 KB • Download

Again, you don't need to change anything here as we already set up the config folder during our call.

Mar 17, 2021 at 10:16 AM Notified 13 people

Harsh Parikh, Tech Lead

OK Thanks

Radomir

.. Please let us know why the branch text is cut when we get response from Search.

Mar 17, 2021 at 10:37 AM Notified 13 people

Radomir Mladenovic

Harsh

it looks like we hit some dtSearch default length limit with the branch text. I increased the limit to the max of 8192 characters and am now building a new index to test this. I hope to have an update on this within an hour.

Mar 17, 2021 at 10:42 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Morgan

,

Bug no. 22155, the wrong results for the "education award" search, is resolved on staging.islg with help of

Radomir

. You can check and confirm.

Mar 17, 2021 at 11:26 AM Notified 13 people

Radomir Mladenovic

Hi

Harsh

, the issue with cut off text is fixed after bumping up the limit in dtsearch. Get the updated indexer and the search app update from the 2021-03-17 folder.
I already created Subjects Nav index with this and copied for you to E:\DevContegraISLGRebuildStagingIndexes\subject-nav-stage

Mar 17, 2021 at 11:52 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Could you please explain to

Morgan

regarding the Dispute Document search performance issue which we discussed on today's call.

Mar 17, 2021 at 12:21 PM Notified 13 people

Radomir Mladenovic

Harsh

Morgan

I already commented on this on Mar 12:
... the problem with this query is that it gives way too many results - there are 231956 records returned for initially found 5783 matches! The response is over 200MB!
I think it doesn't make sense to return all results. With a few different searches, someone could scrape your whole database. I think we should put a hard limit of say 1000 (or 10K) items and never return more than that. Better give an error to user to refine the query.
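A rough sketch of the hard limit proposed here, not the service's actual code. The names `TooManyResultsError` and `check_result_limit` are hypothetical, and the default of 1000 is just the first figure suggested above.

```python
class TooManyResultsError(Exception):
    """Raised when a query matches more items than the service will return."""


def check_result_limit(total_items: int, max_items: int = 1000) -> int:
    """Return total_items unchanged, or raise if it exceeds the hard limit."""
    if total_items > max_items:
        raise TooManyResultsError(
            f"Query matched {total_items} items (limit {max_items}); "
            "please refine the search."
        )
    return total_items


try:
    check_result_limit(231956)
except TooManyResultsError as err:
    print(err)
```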

Mar 17, 2021 at 12:32 PM Notified 13 people

Naomi Joanis, UX Team Lead

Hi

Radomir

,

Thanks for your help/responses to the above, since these bugs have moved back to UAT I'll follow up on the outstanding issue I'm seeing.

#22162 - Synonyms > Not Working

Moved this card to done in TargetProcess

#22160 - Stemming > Not Working

Moved this card to done in TargetProcess

#22141 – Any words > Search doesn't disregard words such as "and" or "or"

Still in Progress with Radomir/DevIT

#22155 - All Words

The branch that was previously missing (under the letter D) appears, however there is an additional branch under the letter P that doesn't include both of the keywords that I'm searching with.

image.png 152 KB • Download

Mar 17, 2021 at 2:26 PM Notified 13 people

Radomir Mladenovic

Hi

Naomi

, as for #22155, the word "award" appears in the document field "documenttypes": "Partial Awards or Decisions on the Merits"

Mar 17, 2021 at 3:45 PM Notified 13 people

Morgan Maguire, CEO

The updated results for "education award" look good to me based on

Harsh

's explanation above. However,

Naomi

could run a few more tests to make sure the results are accurate.

Thanks,

Morgan

Marked Done in TP

Mar 17, 2021 at 4:03 PM Notified 13 people

Morgan Maguire, CEO

Following-up on

Radomir

's comments above:

... the problem with this query is that it gives way too many results - there are 231956 records returned for initially found 5783 matches! The response is over 200MB!
I think it doesn't make sense to return all results. With a few different searches, someone could scrape your whole database. I think we should put a hard limit of say 1000 (or 10K) items and never return more than that. Better give an error to user to refine the query.

I don't want to impose limits on the results, because this will create problems for users if they need to cast broad net searches. Also, this still doesn't explain why the search is so slow for the following example:

Entered "1 January 2020" into From Date field
Entered "31 March 2021" into To Date field
Select Submit Search
Search took 30 seconds to produce results, with only 208 matches in the result list.

image.png 147 KB • Download

This type of search should not be taking this long.

Thanks,

Morgan

Mar 17, 2021 at 4:11 PM Notified 13 people

Radomir Mladenovic

Harsh

can you please send me the request payload for the above date range search?

Mar 17, 2021 at 4:28 PM Notified 13 people

Rob Wiesenberg

Naomi

, I just spoke with

Morgan

and he agreed that it would be helpful if you could test all of the various search filters and confirm that search performance is fast. If you encounter any slowness, please report back with the sample query that you used for testing.

Mar 17, 2021 at 6:59 PM Notified 13 people

Morgan Maguire, CEO

Hi

Naomi

,

Following up on

Rob

's comment above, could we run tests on the Dispute & Dispute Documents search to ensure all the filters are performing at a satisfactory level from a performance perspective? Similar to my post above, we need to ensure that typical searches for documents using the different filtering options generate results in a timely manner.

This would include,

Language: English
Applicable Arbitration Rules: ICSID Arbitration Rules (all versions)
Applicable Treaty: NAFTA Chapter 11
Date Range: all documents from the past year
Document Type: Final Awards
Respondent State: Canada

Could you post the time it takes to generate each result? My expectation is that none of these searches should take more than 4-5 seconds to generate a result.

Thanks,

Morgan

Yes, will do!

Mar 17, 2021 at 10:05 PM Notified 13 people

Rob Wiesenberg

Morgan

, just to clarify further... because the system is currently coded to retrieve all results, search of the Dispute documents will still be slow in cases where they return more than 20k results. We can discuss further on our call tomorrow.

Mar 17, 2021 at 10:32 PM Notified 13 people

Morgan Maguire, CEO

Understood,

Rob

. But I don't think that's the case with any searches above. For example applying the Language: English produces 719 matches and it took more than 30 seconds to generate the results. Therefore, the volume of the search results shouldn't be the problem.

image.png 146 KB • Download

Thanks,

Morgan

Mar 17, 2021 at 11:20 PM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Following is the request payload of the Dispute Document module when the user filters by date range only.

{"searchRequest":"","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"boolean","Operator":"or","clauses":[{"type":"range","field":"Field_61","from":"20191231","to":"20210330"},{"type":"range","field":"Field_110","from":"20191231","to":"20210330"}]}]},"SortField":"FullCitation","SortOrder":"asc"}
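For reference, a small sketch of how a client might assemble the payload above. The helper `date_range_filter` is hypothetical; the field names, the yyyymmdd date format and the and/or nesting are copied from the payload, and the helper does not adjust the dates in any way.

```python
import json
from datetime import date


def date_range_filter(start: date, end: date, fields=("Field_61", "Field_110")) -> dict:
    """Build an AND filter wrapping an OR of range clauses, one per date field."""
    ranges = [
        {
            "type": "range",
            "field": field,
            "from": start.strftime("%Y%m%d"),
            "to": end.strftime("%Y%m%d"),
        }
        for field in fields
    ]
    return {
        "type": "boolean",
        "Operator": "and",
        "clauses": [{"type": "boolean", "Operator": "or", "clauses": ranges}],
    }


payload = {
    "searchRequest": "",
    "FilterStatement": date_range_filter(date(2019, 12, 31), date(2021, 3, 30)),
    "SortField": "FullCitation",
    "SortOrder": "asc",
}
print(json.dumps(payload, separators=(",", ":")))
```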

Mar 18, 2021 at 5:41 AM Notified 13 people

Radomir Mladenovic

Morgan

, you're absolutely right that 30 seconds for 719 results is too much. The problem is for those 719 results way more data is pulled from the search index. I'll try to explain on the case of a date range search provided by Harsh in the previous message.

The search itself reports 670 results and executes in about 150ms! However, as requested by the dev team, we don't return these results but do one more step: we collect all the different disputeIds that appear in these results and return all results with a matching ContentTypeDataMasterId. This blows up to 96859 results to be collected and returned. I really don't know the data model of the application, but something looks very fishy to me here - I don't think almost 100K nodes are needed to represent 670 results on a page. In the end, this takes tens of seconds.

I think too much pressure and expectation is put here on the search service, to do the job it's not really ideal for. The real search is the one that found data in less than 200ms. I think the second step of the search is more appropriate job for the database.

Harsh

, I'd suggest taking a step back here and re-organizing some things. Instead of doing the second step in search, I'd suggest that for the Disputes search we only return collected disputeId's (e.g. as a list in the SelectedNodes) and you get data you need from the database. As we're not doing highlighting here, there's no advantage in getting these results from the search index. In addition, you could use a caching layer in your application to cache node data and make it even more efficient.
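A minimal sketch of the split suggested above, under stated assumptions: the search step returns hits carrying a "DisputeId" key, and `FAKE_DB` stands in for the application database. Both are made up for illustration, as are the function names.

```python
from functools import lru_cache

# Stand-in for the application database; in practice this would be a SQL query.
FAKE_DB = {101: "Dispute A", 102: "Dispute B"}
DB_CALLS = []  # records which ids actually reached the "database"


def collect_dispute_ids(hits):
    """Return the distinct dispute ids from search hits, in first-seen order."""
    seen = {}
    for hit in hits:
        seen.setdefault(hit["DisputeId"], None)
    return list(seen)


@lru_cache(maxsize=10_000)
def load_node(dispute_id):
    """Fetch node data for one dispute; repeated calls are served from cache."""
    DB_CALLS.append(dispute_id)
    return FAKE_DB[dispute_id]


hits = [{"DisputeId": 101}, {"DisputeId": 102}, {"DisputeId": 101}]
ids = collect_dispute_ids(hits)
nodes = [load_node(i) for i in ids]
nodes_again = [load_node(i) for i in ids]  # no further database calls
print(ids, nodes, DB_CALLS)
```

The search index is only asked for ids here; everything else comes from the database or the cache.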

Mar 18, 2021 at 7:51 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Can we have a call to discuss the above, as we are not clear on what you want to change to make the search faster?

We can discuss on the call and finalize the solution.

Mar 18, 2021 at 8:03 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We have a solution: if you provide us the first search result from the Dispute index only, that is enough for us, because we need only the DisputeId (FirstId) and the node name.

After getting the result, when the user clicks on a node name, our second call (the Dispute-doc method) will remain as it is, because that call doesn't take as much time.

Let me know so we can have a quick call and conclude that.

Mar 18, 2021 at 8:21 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Thanks for the call.

As discussed, we need the first search result from the Dispute indexes, limited to 10000 rows.

As finalized, we have provided you 2 columns in the FE_MetafieldwithValueDynamic query:

DisputeCitation
IsDisputeCitation

The DisputeCitation column is the node name of the dispute. In the IsDisputeCitation column we set 1 where DisputeCitation is available, and 0 for the other rows, where DisputeCitation is null.

You need to return only the DisputeId and DisputeCitation columns for rows where IsDisputeCitation is set to 1.

Our second search call will remain as it is: we pass the whole DisputeId collection and get the results.

We have updated the FE_MetafieldwithValueDynamic query on the server for the database ISLGRebuildStaging.

Let me know if you have any questions.
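The selection rule above can be sketched as follows, assuming each row from FE_MetafieldwithValueDynamic is available as a dictionary. The sample rows and the function name `dispute_nodes` are invented for illustration; only the column names come from this message.

```python
def dispute_nodes(rows):
    """Keep DisputeId and DisputeCitation for rows flagged IsDisputeCitation = 1."""
    return [
        {"DisputeId": row["DisputeId"], "DisputeCitation": row["DisputeCitation"]}
        for row in rows
        if row["IsDisputeCitation"] == 1
    ]


# Invented sample rows: one carries the node name, the other does not.
rows = [
    {"RowId": 1, "DisputeId": 10, "DisputeCitation": "Example v. State A", "IsDisputeCitation": 1},
    {"RowId": 2, "DisputeId": 10, "DisputeCitation": None, "IsDisputeCitation": 0},
]
print(dispute_nodes(rows))
```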

Mar 18, 2021 at 9:17 AM Notified 13 people

Radomir Mladenovic

Harsh

Thanks. I'll build a new index and start making changes for this. I'll let you know if I have any questions.

Mar 18, 2021 at 10:10 AM Notified 13 people

Radomir Mladenovic

Harsh

the indexer is breaking now because FE_MetafieldwithValueDynamic doesn't provide the RowId column any more. Is there another column that should be used as an identifier now? If not, can you please fix the RowId?

Mar 18, 2021 at 11:29 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We have updated the FE_MetafieldwithValueDynamic query on server. Now you can get RowId as identifier.

Mar 18, 2021 at 11:42 AM Notified 13 people

Radomir Mladenovic

Harsh

I modified the search controller as discussed. I'm getting results for the date filter in less than 0.5s. I hope it returns all data you were expecting.
You can get the update from the 2021-03-18 folder.

Mar 18, 2021 at 1:51 PM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

If you pass the following date filter, do you get a result of about 208 disputes?

Entered "1 January 2020" into From Date field
Entered "31 March 2021" into To Date field

Mar 18, 2021 at 1:57 PM Notified 13 people

Naomi Joanis, UX Team Lead

Hi

Radomir

,

Yesterday I tested the FTS feature and there were some bugs that arose to bring up here:

#22515 - Stemming > Doesn't Work

check off stemming
type in "like"
enter search
select the first document result, Methanex Corporation v. United States of America, UNCITRAL, Transcript of Hearing on Jurisdiction and Admissibility, 11 July 2001
View the document PDF and Ctrl+F search for "likely" (it appears on page 493)
Go back to search results and select page excerpt for 493

Result: The word "likely" isn't highlighted

Expected: Will be highlighted if stemming is enabled

#22517 – Synonym > Doesn't Work

Set synonyms to on
search "bias"
View first result
View page 17 excerpt

Result: The word "prejudice" exists but isn't highlighted

Expected: Will be highlighted

Mar 18, 2021 at 2:22 PM Notified 13 people

Radomir Mladenovic

Hi

Harsh

, yes, I'm getting 208 results for that range query.

Mar 18, 2021 at 2:58 PM Notified 13 people

Radomir Mladenovic

Hi

Naomi

#22515 - Stemming > Doesn't Work

I think you might be mixing up the "physical" PDF page with the number shown on the page. Search results show the physical page numbers. On physical page 493 I think there's "likeliness", but the text on it shows page number 495. That means that the page 493 you were looking at is physical page 491. I opened it and it shows "likely" highlighted:

image.png 66.4 KB • Download

Mar 18, 2021 at 3:12 PM Notified 13 people

Radomir Mladenovic

#22517 – Synonym > Doesn't Work
I've found and fixed one issue related to this. It should be fine in the search service update I provided.

Mar 18, 2021 at 3:59 PM Notified 13 people

Naomi Joanis, UX Team Lead

Hi

Radomir

,

Re #22515 - Stemming > Doesn't Work

I understood from your response that I should be looking at the page excerpt for 491 instead of 493 where the word "likely" appears in the pdf physical numbers which makes sense. When I look at the results card in FTS I'm not seeing the page excerpt for 491. I am only seeing a result on 490 for "likeness". Based on your screenshot this seems like I should have an excerpt for page 491?

Screen Recording 2021-03-18 at 1.22.53 PM.mov 12.2 MB • Download

Mar 18, 2021 at 5:27 PM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Please let us know once you have resolved bugs no. 22515 and 22517 for the FTS module, and provide us the update.

Mar 19, 2021 at 8:21 AM Notified 13 people

Radomir Mladenovic

Harsh

,

I don't see any issue with these:

#22517 - the example below shows both "bias" and "prejudice" highlighted when searching for "prejudice"

image.png 132 KB • Download

#22515 - yesterday I sent a screenshot showing highlighting for stemming.

Harsh

make sure you send necessary search options (Stemming and WordNetSynonyms) to "highlight-para" if you want them to work. If it still doesn't work for you, please send me your request payload.

Mar 19, 2021 at 9:12 AM Notified 13 people

Harsh Parikh, Tech Lead

Radomir

, I will send you the request payloads for the above 2 issues within half an hour.

Mar 19, 2021 at 9:28 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Following are the request payloads for the above 2 bugs.

Bug no. 22515

{"searchRequest":"like","SearchType":"Boolean","Stemming":true,"WordNetSynonyms":false,"Fuzzy":false,"Fuzziness":"1","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"DocumentContentTypeId","values":["37","13","12"]}]},"PageNum":0,"PageSize":20}

Bug No : 22517

{"searchRequest":"bias","SearchType":"Boolean","Stemming":false,"WordNetSynonyms":true,"Fuzzy":false,"Fuzziness":"1","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"DocumentContentTypeId","values":["37","13","12"]}]},"PageNum":0,"PageSize":20}

Could you please check and let us know.

Mar 19, 2021 at 9:56 AM Notified 13 people

Radomir Mladenovic

Harsh

these are search payloads. In the above issue I believe we're talking about paragraph highlighting ("highlight-para" service method).

Mar 19, 2021 at 9:58 AM Notified 13 people

Harsh Parikh, Tech Lead

OK, my mistake

Radomir

. I will provide them.

Mar 19, 2021 at 10:00 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Following are my search requests for highlight-para.

Bug no. : 22515

{"searchRequest":"like","SearchType":"3","Stemming":"true","Synonyms":"false","Fuzzy":"false","Fuzziness":"1","paraId":"ECB6D91BB3177C32B5E0B70F4E5AC7C1#MTQ="}

Bug no. : 22517

{"searchRequest":"bias","SearchType":"3","Stemming":"false","Synonyms":"true","Fuzzy":"false","Fuzziness":"1","paraId":"80D3583E561ED87838529EE310CB553A#MTM="}

Mar 19, 2021 at 10:07 AM Notified 13 people

Radomir Mladenovic

Harsh

#22515 - in the sample you gave me, only "like" appears. Please provide another example showing that stemming is not working.

#22517 - you're sending Synonyms instead of WordNetSynonyms. It works when you put WordNetSynonyms=true.
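One way to keep the two calls consistent is to derive the "highlight-para" payload from the search payload, so the option names cannot drift apart. This is only a sketch: the helper `highlight_para_payload` is hypothetical, and it assumes that "highlight-para" accepts the same option names and values as the search call.

```python
# Options that both calls are assumed to share (names taken from the payloads in this thread).
SEARCH_OPTION_KEYS = (
    "searchRequest", "SearchType", "Stemming", "WordNetSynonyms", "Fuzzy", "Fuzziness",
)


def highlight_para_payload(search_payload: dict, para_id: str) -> dict:
    """Copy the shared search options and attach the paragraph id."""
    payload = {key: search_payload[key] for key in SEARCH_OPTION_KEYS if key in search_payload}
    payload["paraId"] = para_id
    return payload


search_payload = {
    "searchRequest": "bias",
    "SearchType": "Boolean",
    "Stemming": False,
    "WordNetSynonyms": True,
    "Fuzzy": False,
    "Fuzziness": "1",
    "PageNum": 0,
    "PageSize": 20,
}
print(highlight_para_payload(search_payload, "80D3583E561ED87838529EE310CB553A#MTM="))
```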

Mar 19, 2021 at 10:22 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Could you please look at Morgan's last video in the following Basecamp thread:

Dispute Documents search field does not produce any results - TOLOGIX - ISLG App Rebuild

And provide your feedback on the Dispute Document search.

Following is my request payload, as per the data in Morgan's video.

{"searchRequest":"","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"Field_62","values":["234","235"]},{"type":"match","field":"Field_DocumentTypeId","values":["1064","1067"]},{"type":"match","field":"Field_109","values":["487"]},{"type":"boolean","Operator":"or","clauses":[{"type":"range","field":"Field_61","from":"20200229","to":"20210320"},{"type":"range","field":"Field_110","from":"20200229","to":"20210320"}]}]},"SortField":"FullCitation","SortOrder":"asc"}

Mar 22, 2021 at 7:05 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

In the FTS module, the Contegra team raised one issue.

Bug No. : 22502

Steps to reproduce:

Go to FTS
Set document type to arbitration rules and treaties
Select "International Centre for Settlement of Investment Disputes" as arbitration rule type
Select Free Trade Agreement (FTA) in treaty type
Enter the keyword "tribunal"
View results

Result: No results show, even though there are results for this arbitration rule type

Expected: Will see results

Following is my search request for the above scenario.

{"searchRequest":"tribunal","SearchType":"Boolean","Stemming":false,"WordNetSynonyms":false,"Fuzzy":false,"Fuzziness":"1","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"Field_27","values":["368"]},{"type":"match","field":"Field_34","values":["44"]},{"type":"match","field":"DocumentContentTypeId","values":["13","12"]}]},"PageNum":0,"PageSize":20}

I assume we need to set an OR condition between the metafields.

Please suggest.

Mar 22, 2021 at 7:37 AM Notified 13 people

Radomir Mladenovic

Hi

Harsh

, as for the filtering example in Disputes, I don't see anything strange in your search request. However, I don't know which name is which in the app, so it's hard to say what's wrong. What are the field names of the filters that don't work?
Does filtering work if you use only filters on one of these fields?
Are you sure these fields are included in the FE_MetafieldwithValueDynamic?

Mar 22, 2021 at 8:41 AM Notified 13 people

Radomir Mladenovic

Harsh

as for the FTS search, I see that Field_27 and Field_34 don't have results with AND. Search works when I filter only on one of them but I didn't see the other field in the results.
I'm not sure if you need to use OR, from the UI it doesn't look like OR would be the expected behavior.
Make sure you're using the right field name.

Mar 22, 2021 at 8:55 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

As per Morgan's video, if we apply the following filters together with other filters, they do not work.

Respondent State : [Field_109] : value 487
Applicable Instrument(s) : [Field_69] : value 11333
Applicable Arbitration Rules : [Field_70] : value 11086

All of the above data is available in the FE_MetafieldwithValueDynamic result. But Morgan said that when the above fields are applied in combination with other filters, they don't work, whereas if only the above field filters are applied, they work.

Let me know if you need a call. I am available.

Mar 22, 2021 at 9:00 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

For the FTS bug, each filter works if we apply it individually with the word "tribunal". For example:

Searching only [Field_27] with the word "tribunal" produces 22 results.

Searching only [Field_34] with the word "tribunal" produces 16 results.

But Naomi's expectation is that if we apply both filters, we should get 38 results.

So what do we need to change in the JSON search query?

Mar 22, 2021 at 9:14 AM Notified 13 people

Radomir Mladenovic

Harsh

as filters work in some other cases, the first thing I'd check here is data. Can you find a row in the stored proc results that contains data matching both fields? If the data is there, please send me the IDs so that I can track it down in the indexing log and index.

Mar 22, 2021 at 10:10 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Are you asking about the FTS bug?

Mar 22, 2021 at 10:21 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

If you are available, can we have a call to resolve both the Dispute Document and FTS bugs?

Mar 22, 2021 at 10:37 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

For the FTS bug, the following is just an example of RowIds:

[Field_27] is available in RowIds (1,2,3)

[Field_34] is available in RowIds (535,558)

We need all these RowIds in the results.

Mar 22, 2021 at 11:23 AM Notified 13 people

Radomir Mladenovic

Harsh

, as per your previous message, the data is the problem. For FTS, one result comes from one row. If you don't have both fields in a single row, then how can it appear in results when combined using AND?

I can also confirm this differently... When I run a search with Field_34 only, I get 16 results as you said. I'm attaching result JSON and you can see there's no Field_27 in results at all. So, it was not in the data that was indexed.

res-34.json 205 KB • Download
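To make the row-level behaviour concrete, here is a toy model of how such a filter statement could be evaluated per indexed row. It is a simplified illustration, not the search service's implementation: the rows are synthetic (22 carrying only Field_27, 16 carrying only Field_34, matching the counts reported above), and the keyword and DocumentContentTypeId parts of the request are ignored.

```python
def matches(row: dict, stmt: dict) -> bool:
    """Evaluate a 'match' or nested 'boolean' clause against a single row."""
    kind = stmt["type"]
    if kind == "match":
        value = row.get(stmt["field"])
        return value is not None and str(value) in stmt["values"]
    if kind == "boolean":
        results = (matches(row, clause) for clause in stmt["clauses"])
        return all(results) if stmt["Operator"].lower() == "and" else any(results)
    raise ValueError(f"unsupported clause type: {kind}")


# Synthetic rows: no row carries both fields.
rows = [{"RowId": i, "Field_27": "368"} for i in range(22)]
rows += [{"RowId": 100 + i, "Field_34": "44"} for i in range(16)]

clauses = [
    {"type": "match", "field": "Field_27", "values": ["368"]},
    {"type": "match", "field": "Field_34", "values": ["44"]},
]
and_count = sum(matches(r, {"type": "boolean", "Operator": "and", "clauses": clauses}) for r in rows)
or_count = sum(matches(r, {"type": "boolean", "Operator": "or", "clauses": clauses}) for r in rows)
print(and_count, or_count)
```

With no row holding both fields, AND can never match a row, while OR matches every row that has either field.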

Mar 22, 2021 at 11:39 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Yes, I know that the [Field_34] and [Field_27] data are not available together when combined with AND.

That's why I am asking how to set an OR condition in the request payload so that we get the results.

Following is my current JSON request

{"searchRequest":"tribunal","SearchType":"Boolean","Stemming":false,"WordNetSynonyms":false,"Fuzzy":false,"Fuzziness":"1","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"Field_27","values":["368"]},{"type":"match","field":"Field_34","values":["44"]},{"type":"match","field":"DocumentContentTypeId","values":["13","12"]}]},"PageNum":0,"PageSize":20}

So, is there any way to set an OR condition in the above request payload so that we get all 38 results?

Mar 22, 2021 at 11:47 AM Notified 13 people

Radomir Mladenovic

Harsh

sorry, I didn't understand because this type of query was discussed already (e.g. my comment from Sep 14).
Just nest the OR clause within the AND:

{
    "searchRequest": "tribunal",
    "SearchType": "Boolean",
    "Stemming": false,
    "WordNetSynonyms": false,
    "Fuzzy": false,
    "Fuzziness": "1",
    "FilterStatement": {
        "type": "boolean",
        "Operator": "and",
        "clauses": [
            {
                "type": "boolean",
                "Operator": "or",
                "clauses": [
                    {
                        "type": "match",
                        "field": "Field_27",
                        "values": [
                            "368"
                        ]
                    },
                    {
                        "type": "match",
                        "field": "Field_34",
                        "values": [
                            "44"
                        ]
                    }
                ]
            },
            {
                "type": "match",
                "field": "DocumentContentTypeId",
                "values": [
                    "13",
                    "12"
                ]
            }
        ]
    },
    "PageNum": 0,
    "PageSize": 20
}

This one returns 38 results.
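On the client side, the nesting above can be built with a few small helpers rather than hand-written JSON. The helper names `match`, `all_of` and `any_of` are hypothetical; the output structure follows the payload shown above.

```python
import json


def match(field, *values):
    """A 'match' clause; values are sent as strings, as in the payloads above."""
    return {"type": "match", "field": field, "values": [str(v) for v in values]}


def all_of(*clauses):
    """Combine clauses with AND."""
    return {"type": "boolean", "Operator": "and", "clauses": list(clauses)}


def any_of(*clauses):
    """Combine clauses with OR."""
    return {"type": "boolean", "Operator": "or", "clauses": list(clauses)}


filter_statement = all_of(
    any_of(match("Field_27", 368), match("Field_34", 44)),
    match("DocumentContentTypeId", 13, 12),
)
print(json.dumps(filter_statement, indent=4))
```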

Mar 22, 2021 at 11:56 AM Notified 13 people

Harsh Parikh, Tech Lead

OK Thanks

Radomir

..

Will try this and will update you.

Mar 22, 2021 at 12:05 PM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Could you provide feedback on the Dispute Document results?

-------

As per Morgan's video, if we apply the following filters together with other filters, they do not work.

Respondent State : [Field_109] : value 487
Applicable Instrument(s) : [Field_69] : value 11333
Applicable Arbitration Rules : [Field_70] : value 11086

All of the above data is available in the FE_MetafieldwithValueDynamic result. But Morgan said that when the above fields are applied in combination with other filters, they don't work, whereas if only the above field filters are applied, they work.

Let me know if you need a call. I am available.

Mar 22, 2021 at 12:06 PM Notified 13 people

Radomir Mladenovic

Harsh

, the same thing as with the FTS. Please find a database row in FE_MetafieldwithValueDynamic where those fields are present together. If they are not expected to be together, then you probably need to OR them. I cannot answer this question for you as I don't know your data model.

Mar 22, 2021 at 12:46 PM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

For the Dispute Document module, suppose I search with Russian as the Language and Lithuania as the Respondent State; then I should get results.

We don't need to apply the OR operator; the operator should be AND.

Following is the Payload Search Request for Russian Language and Lithuania Respondent State:

Field_109 : Respondent State (value : 487)
Field_62 : Language (value : 248)

{"searchRequest":"","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"Field_62","values":["248"]},{"type":"match","field":"Field_109","values":["487"]}]},"SortField":"FullCitation","SortOrder":"asc"}

In FE_MetafieldwithValueDynamic Query you get Field_109 value in RowId : 1181 and Field_60 value in RowId : 8829

In both rows, the first value, DisputeId, is 13585, so we expect the first search to return the DisputeId from RowId 1181.

And for the second search, you should pass the DisputeId collection [13585,21375] and provide the results.

Please let us know.

Mar 22, 2021 at 1:45 PM Notified 13 people

Radomir Mladenovic

Harsh

you say:

In FE_MetafieldwithValueDynamic Query you get Field_109 value in RowId : 1181 and Field_60 value in RowId : 8829

That's exactly why it doesn't work with AND. The fields do not belong to the same indexed row. What do you want me to do here? Any workaround that I could apply will result in a significant performance degradation. The fix should be in the database view.

As the first search is pulling data for DisputeIds, I'd say you need to think about changing FE_MetafieldwithValueDynamic in a way that the DisputeId is unique in the results (so that it can be used as an identifier instead of RowId) and you include all fields related to it. That way both Field_109 and Field_60 should appear in the same row.

If that's too complicated on your end, maybe we could make it in two steps:

Change FE_MetafieldwithValueDynamic so that DisputeId is used as a row identifier, but include only fields which you can.
Make an additional view/proc/query that returns all custom fields for a given disputeId? It could return multiple rows, doesn't matter.

Then, I could change the indexer to run the second query for each row that appears in the FE_MetafieldwithValueDynamic, and add all found fields to the same indexed document. The indexer will probably run much longer but at least everything would be indexed properly.
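The merge step described above might look roughly like this. It is a sketch under stated assumptions, not the indexer's code: `build_documents` is a hypothetical name, `fields_for_dispute` stands in for the second query, and the sample values reuse the ids from the payload in this thread.

```python
from collections import defaultdict


def build_documents(dispute_rows, fields_for_dispute):
    """Merge each dispute row with every extra field row found for its DisputeId."""
    documents = []
    for row in dispute_rows:
        doc = defaultdict(set)
        for name, value in row.items():
            doc[name].add(str(value))
        # Second query: may return several rows per dispute; all land in one document.
        for extra in fields_for_dispute(row["DisputeId"]):
            for name, value in extra.items():
                doc[name].add(str(value))
        documents.append({name: sorted(values) for name, values in doc.items()})
    return documents


# Stand-in for the second query, keyed by DisputeId.
EXTRA_FIELDS = {13585: [{"Field_62": "248"}, {"Field_109": "487"}]}

docs = build_documents(
    [{"DisputeId": 13585, "Field_109": "487"}],
    lambda dispute_id: EXTRA_FIELDS.get(dispute_id, []),
)
print(docs)
```

Once both fields sit in the same indexed document, an AND filter over them can match it.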

I hope this makes sense.

P.S. I will be out of the office the whole morning tomorrow and can get back to you only during my afternoon.

Mar 22, 2021 at 3:20 PM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We cannot make DisputeId a unique identifier, as multiple documents are associated with a single dispute, so the same DisputeId can appear in multiple rows.

For the second option, the two-step process, let's discuss today, as we have only this week to complete this task.

Let us know once you are available for a call to finalize this.

We are available between 10:00 AM and 6:00 PM IST.

Mar 23, 2021 at 4:40 AM Notified 13 people

Radomir Mladenovic

Harsh

as I said in my previous message, I'm not available today during your work hours. I'm going on a trip in a few minutes and will only be available late afternoon/tonight Europe time, when I'm back.
From everything discussed these months, I hope you understand how indexing and search work. Please provide a view (or views) that will allow us to index all metadata that you need associated with a single row in the first results table, as that will be the search result item.

Mar 23, 2021 at 5:37 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

It is not possible to put all metafields in a single row, as the search is looking at different rows.

We should go with the two-step process as per your suggestion, but we need to discuss and confirm which data we will provide to you in the view/stored procedure.

Let us know when you are available for a call, in IST.

Mar 23, 2021 at 5:43 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

As per the discussion in today's call about the Dispute Document Filter:

The initial search query FE_MetafieldwithValueDynamic will remain as it is. We haven't changed anything. This means you provide the initial search results from this index.

As discussed, we created a new query, FE_SelectDocumentViewForContegraSearch, on the server (database: ISLGRebuildStaging), which contains all Dispute Document metafields and the DisputeId.

You can use FE_SelectDocumentViewForContegraSearch as the second-step query for the initial search.

Please let me know if you have any questions.

Mar 24, 2021 at 8:59 AM Notified 13 people

Radomir Mladenovic

Harsh

please add a disputeId parameter to the FE_SelectDocumentViewForContegraSearch so that it returns data for a single disputeId only.

Also, I need a separate procedure that returns only disputes! The FE_MetafieldwithValueDynamic can remain as is for the "dispute-details" call. However, for indexing, I need one procedure that returns only disputes with their metadata. Using the disputeId from each row, I'd pass it to FE_SelectDocumentViewForContegraSearch to get the dispute documents for each dispute.

Mar 24, 2021 at 9:34 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

You can find the DisputeId column (3rd column) in the stored procedure FE_SelectDocumentViewForContegraSearch.

image.png 259 KB • Download

The new separate procedure for Dispute metafields only is: FE_SelectDisputeViewForContegraSearch

But please make sure you provide results from the FE_MetafieldwithValueDynamic query, as all sorting fields and the DisputeId Contains field are available in this query.

Mar 24, 2021 at 9:42 AM Notified 13 people

Radomir Mladenovic

Harsh

I know that the disputeId is in the results. I need results to return data only for a particular disputeId. This method will be called for each row in the FE_SelectDisputeViewForContegraSearch.
Another option is that we read data from FE_SelectDisputeViewForContegraSearch once, cache it, and then only filter the data when needed. This will be much faster but will take more RAM. I'm not sure how many rows are in this. If you think that the complete data set can be kept in memory, we can proceed this way.
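
A rough sketch of the caching option, in Python for illustration only. The rows, IDs, and the `group_by_dispute` helper are made up; the point is the trade-off described above: the whole result set is read once and held in memory, and each per-dispute lookup becomes a dictionary read instead of a database call.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

Row = Dict[str, object]


def group_by_dispute(rows: Iterable[Row]) -> Dict[object, List[Row]]:
    """Read the full result set once and bucket the rows by DisputeId."""
    grouped: Dict[object, List[Row]] = defaultdict(list)
    for row in rows:
        grouped[row["DisputeId"]].append(row)
    return dict(grouped)


# Hypothetical rows standing in for the complete result set of the query.
all_rows = [
    {"DisputeId": 1, "ContentTypeDataMasterId": 101},
    {"DisputeId": 2, "ContentTypeDataMasterId": 201},
    {"DisputeId": 2, "ContentTypeDataMasterId": 202},
]
cache = group_by_dispute(all_rows)

# Per-dispute lookups no longer touch the database.
rows_for_2 = cache.get(2, [])
rows_for_unknown = cache.get(99, [])  # a dispute with no rows yields an empty list
```

Memory use grows with the total number of rows, which is why the row count matters before choosing this route.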

Next, the new FE_SelectDisputeViewForContegraSearch doesn't even have a disputeId column. Please fix this or let me know how to use it.

Mar 24, 2021 at 9:50 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We have added the DisputeId column to FE_SelectDisputeViewForContegraSearch.

Both queries return the full dataset.

Mar 24, 2021 at 9:54 AM Notified 13 people

Radomir Mladenovic

Harsh

I uploaded the indexer and search service update to the 2021-03-24 folder.

The indexer config for disputes should be updated - IndexStoredProcTables was modified, and DisputeDocsMetadataProc is the new property:

  "IndexStoredProcTables": [ "FE_SelectDisputeViewForContegraSearch" ],
  "DisputeDocsMetadataProc": "FE_SelectDocumentViewForContegraSearch",

Just a reminder: delete the old index folder before indexing.

Let me know how it worked for you.

Mar 24, 2021 at 10:15 PM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We made the above changes and created new indexes, but it doesn't work.

I don't understand why you removed FE_MetafieldwithValueDynamic from the indexer, because we need the initial search results from this index.

We provided the following two queries just to filter the data and pass the DisputeId to the FE_MetafieldwithValueDynamic index, so that we get all the desired results.

FE_SelectDisputeViewForContegraSearch
FE_SelectDocumentViewForContegraSearch

This is because all our model columns and sorting field columns for the initial search are available in the FE_MetafieldwithValueDynamic query.

Please let me know once you are available so we can have a call and discuss.

Mar 25, 2021 at 4:58 AM Notified 13 people

Radomir Mladenovic

Harsh, please explain how it doesn't work. I understood that for the first search you need only dispute info, and that on click you'll get the rest using "dispute-details".
If you prefer the old result format, we can keep the index from the old procedure and use it for results after collecting disputes from this search.

Mar 25, 2021 at 6:37 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

If I search for Canada, it gives 0 results. Previously, it gave 34 dispute records.

Now, any search gives me a null result.

The following is my search request.

{"searchRequest":"","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"Field_109","values":["403"]}]},"SortField":"FullCitation","SortOrder":"asc"}
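
For readers following the payloads in this thread, a minimal sketch of how a FilterStatement like the one above could be evaluated against one indexed document. It is inferred only from the request shapes posted here and is not the search service's code: the `matches` function and the sample rows are hypothetical, and range clauses are assumed to be inclusive string comparisons on yyyymmdd values.

```python
from typing import Any, Dict


def matches(node: Dict[str, Any], doc: Dict[str, Any]) -> bool:
    """Evaluate a boolean/match/range filter tree against a single document."""
    kind = node["type"]
    if kind == "boolean":
        results = [matches(clause, doc) for clause in node["clauses"]]
        if node["Operator"].lower() == "and":
            return all(results)
        return any(results)
    if kind not in ("match", "range"):
        raise ValueError(f"unknown clause type: {kind}")
    value = doc.get(node["field"])
    if value is None:
        return False  # the field is not part of this indexed row
    if kind == "match":
        return str(value) in node["values"]
    return node["from"] <= str(value) <= node["to"]


filter_statement = {
    "type": "boolean",
    "Operator": "and",
    "clauses": [{"type": "match", "field": "Field_109", "values": ["403"]}],
}

# Hypothetical indexed rows: only the first one carries Field_109.
row_with_field = {"Field_109": "403", "FullCitation": "A v. B"}
row_without_field = {"Field_60": "77", "FullCitation": "A v. B"}
```

Because each clause is checked against one document at a time, an "and" over two fields can only succeed when both fields were indexed into the same row.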

But I still don't understand how you will provide results, because all the columns I need in my model are available in the FE_MetafieldwithValueDynamic query, and now you are not indexing this query's result.

Please let's have a call and finalize.

Mar 25, 2021 at 6:43 AM Notified 13 people

Radomir Mladenovic

Hi

Harsh

,

Please check the things on your end. For the query you sent me, I'm getting 34 results. Response attached.

canada-response.json 326 KB • Download

Mar 25, 2021 at 7:00 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

This is my modified Dispute Index config file. Please check whether it is OK.

indexer-config-disputes.json 583 Bytes • Download

Mar 25, 2021 at 7:04 AM Notified 13 people

Radomir Mladenovic

Harsh

Not sure why you have DocIdColumnName: "ContentTypeDataMasterID" in your config. I have "DocIdColumnName": "RowId", but it has been like that for a long time; it's not a new change.

Mar 25, 2021 at 7:10 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Can we have a quick call? I think there is some confusion about the indexing JSON file.

Mar 25, 2021 at 7:12 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Could you provide your indexing JSON files for both Dispute and Dispute Document, so we can check again?

Mar 25, 2021 at 7:16 AM Notified 13 people

Radomir Mladenovic

Harsh

I'm attaching the config files I'm using.

indexer-config-dispute-docs-staging.json 529 Bytes • Download

indexer-config-disputes-staging-new.json 638 Bytes • Download

Mar 25, 2021 at 7:25 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

The first (initial) search is working now. But when we call the second method, disputes-details, it doesn't return the Dispute Document data.

Here is my search request for the disputes-details service method.

{"searchRequest":"","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"Field_62","values":["234"]}]},"SortField":"FullCitation","SortOrder":"asc","FieldFilterName":"disputeid","FieldFilterValues":[12877]}

As you will remember, we passed the disputeId collection to our second query, SelectDisputeContegraSearch, to get the dispute detail and document results.

Please let us know.

Mar 25, 2021 at 10:40 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Could we connect about the above query, or are you looking into it?

Mar 25, 2021 at 1:05 PM Notified 13 people

Radomir Mladenovic

Hi

Harsh

, I'm looking into this but have some doubts about your use of the filter in the second call. I'll call you on Skype.

Mar 25, 2021 at 1:14 PM Notified 13 people

Radomir Mladenovic

Harsh

updated indexer and search service are in the 2021-03-25 folder.

Regarding your previous search example, are you sure you provided a good disputeId? For disputeId 12877 I don't see data in the database.

image.png 26 KB • Download

Mar 25, 2021 at 10:18 PM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

When I copied SearchController and ran the API project, it gave me the following error.

Severity Code Description Project File Line Suppression State
Error CS1061 'TologixSettings' does not contain a definition for 'DisputeComboIndex' and no accessible extension method 'DisputeComboIndex' accepting a first argument of type 'TologixSettings' could be found (are you missing a using directive or an assembly reference?) TologicWebSearch D:\Harsh\Harsh\Contegra Projects\TologixWebSearch\Controllers\SearchController.cs 503 Active

We have also updated the app.config file and set DisputeComboIndex under TologixSettings.

image.png 240 KB • Download

Is there anything missing from your side?

Mar 26, 2021 at 6:58 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

The following line in SearchController is underlined in red.

image.png 259 KB • Download

Mar 26, 2021 at 6:59 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We still don't get the Dispute Document detail data from SelectDisputeContegraSearch. As we discussed, we need to pass all the dispute and dispute document IDs that we provide in the FE_MetafieldwithValueDynamic query (DisputeId column).

Currently we get only the dispute detail when we call our second method, disputes-details.

Mar 26, 2021 at 7:09 AM Notified 13 people

Radomir Mladenovic

Harsh

I added AppSettings.cs to the same folder but I guess you already fixed that.

Mar 26, 2021 at 8:23 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

As discussed, I have updated the DisputeId column in the following stored procedure on our server for the database ISLGRebuildStaging.

FE_SelectDocumentViewForContegraSearch

image.png 303 KB • Download

As an example, for the first search you pass the following JSON request:

First Request :

{"searchRequest":"A.M.F. Aircraftleasing Meier & Fischer GmbH & Co. KG v. Czech Republic, PCA Case No. 2017-15, Respondent Press Release, 5 December 2016 [Czech]","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"Field_62","values":["15541"]}]},"SortField":"FullCitation","SortOrder":"asc"}

On click, the second method fetches all the dispute detail and document detail using the second request:

Second Request :

{"searchRequest":"A.M.F. Aircraftleasing Meier & Fischer GmbH & Co. KG v. Czech Republic, PCA Case No. 2017-15, Respondent Press Release, 5 December 2016 [Czech]","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"Field_62","values":["15541"]}]},"SortField":"FullCitation","SortOrder":"asc","FieldFilterName":"disputeid","FieldFilterValues":[13406]}

I need results for both ContentTypeDataMasterId rows (13406 and 20758) from SelectDisputeContegraSearch.

13406 is our Dispute Detail
20758 is Dispute DocumentDetail
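
As a sketch of the two-request flow described above: the second request is the first request plus FieldFilterName and FieldFilterValues. The `details_request` helper below is hypothetical (Python, for illustration only) and the first request is shortened; the field names and the value 13406 are taken from the payloads above.

```python
from typing import Any, Dict, Iterable


def details_request(first_request: Dict[str, Any], dispute_ids: Iterable[int]) -> Dict[str, Any]:
    """Build the second request by adding the dispute filter to a copy of the first."""
    request = dict(first_request)  # shallow copy; the first request is not modified
    request["FieldFilterName"] = "disputeid"
    request["FieldFilterValues"] = list(dispute_ids)
    return request


# Shortened version of the first request shown above.
first = {
    "searchRequest": "",
    "FilterStatement": {
        "type": "boolean",
        "Operator": "and",
        "clauses": [{"type": "match", "field": "Field_62", "values": ["15541"]}],
    },
    "SortField": "FullCitation",
    "SortOrder": "asc",
}
second = details_request(first, [13406])
```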

Mar 26, 2021 at 9:16 AM Notified 13 people

Radomir Mladenovic

Hi

Harsh

, I sent you an update in the 2021-03-26 folder. For indexing, note that the config was updated as well. I added "ContentTypeDataMasterId" to the "FacetedFields".

Let me know if this returns expected results.

Mar 26, 2021 at 11:45 AM Notified 13 people

Harsh Parikh, Tech Lead

OK

Radomir

. I will integrate and check and get back to you.

Mar 26, 2021 at 11:53 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

The search is working, but we found one issue: if a dispute doesn't have any documents, the second request doesn't fetch the dispute data.

Can we have a quick call?

Mar 26, 2021 at 2:01 PM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

The first request is working now. The second request is also working, but for some disputes it gives me a bad request error and doesn't return a result.

This is my first request. It gives me 186 results, which is fine.

{"searchRequest":"","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"boolean","Operator":"or","clauses":[{"type":"range","field":"Field_61","from":"20200229","to":"20210324"},{"type":"range","field":"Field_110","from":"20200229","to":"20210324"}]}]},"SortField":"FullCitation","SortOrder":"asc","FieldFilterName":"disputeid","FieldFilterValues":[22997]}

Second request: when I click on a dispute, the second request is sent, but it doesn't return a result; it throws a bad request error.

{"searchRequest":"","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"boolean","Operator":"or","clauses":[{"type":"range","field":"Field_61","from":"20200229","to":"20210324"},{"type":"range","field":"Field_110","from":"20200229","to":"20210324"}]}]},"SortField":"FullCitation","SortOrder":"asc","FieldFilterName":"disputeid","FieldFilterValues":[22997]}

image.png 194 KB • Download

The following second request is working.

{"searchRequest":"","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"boolean","Operator":"or","clauses":[{"type":"range","field":"Field_61","from":"20200229","to":"20210320"},{"type":"range","field":"Field_110","from":"20200229","to":"20210320"}]}]},"SortField":"FullCitation","SortOrder":"asc","FieldFilterName":"disputeid","FieldFilterValues":[12400]}

Mar 26, 2021 at 2:14 PM Notified 13 people

Radomir Mladenovic

Harsh

I updated the indexer, it's in the same folder. As you suspected, it was related to a dispute that has no documents.
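
The actual fix is not shown in this thread. As a hedged illustration of the general problem only, the Python sketch below (hypothetical names and data) attaches documents to disputes in a way that still returns a dispute when it has no documents, by falling back to an empty list instead of failing on the missing key.

```python
from typing import Any, Dict, List


def attach_documents(
    disputes: List[Dict[str, Any]],
    documents_by_dispute: Dict[int, List[Dict[str, Any]]],
) -> List[Dict[str, Any]]:
    """Return every dispute with its documents; disputes without documents get []."""
    result = []
    for dispute in disputes:
        # .get with a default avoids a KeyError for disputes that have no documents.
        docs = documents_by_dispute.get(dispute["DisputeId"], [])
        result.append({**dispute, "Documents": docs})
    return result


# Hypothetical data: dispute 2 has no documents at all.
disputes = [{"DisputeId": 1}, {"DisputeId": 2}]
documents_by_dispute = {1: [{"ContentTypeDataMasterId": 101}]}
combined = attach_documents(disputes, documents_by_dispute)
```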

Mar 26, 2021 at 3:01 PM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

That was my assumption: a dispute that has no documents creates the issue.

But I will check tomorrow.

Could you please provide the name of the folder where you put the updated indexer?

Mar 26, 2021 at 3:26 PM Notified 13 people

Radomir Mladenovic

Harsh

it's under 2021-03-26, I updated the installer there.

Mar 26, 2021 at 3:34 PM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

With this new indexer, the Dispute Document index is not created. It generates an error.

I have attached the indexing log file and the indexer config files. The first index (Dispute) is generated, but the second index (Dispute Document) is not. Please check.

image.png 226 KB • Download

indexing.log 5.07 KB • Download

indexer-config-dispute-docs.json 480 Bytes • Download

indexer-config-disputes.json 753 Bytes • Download

Mar 27, 2021 at 4:58 AM Notified 13 people

Radomir Mladenovic

Harsh

I fixed the indexer and it's under the 2021-03-27 folder.

Mar 27, 2021 at 9:31 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Hope you're doing well.

Naomi

has raised 2 issues in the FTS module.

Bug no. 22939

Steps to reproduce:

Go to FTS
enter "t" in keyword space
In filter dispute documents by > applicable treaty > select "Agreement between Japan and India for an Economic Partnership (2011) (excerpts)"
Enter search

Result: No results found

Expected: Will see the two published dispute documents where the dispute has this instrument

Below is the payload request for the above criteria. We found that we are getting 2 rows from the FE_MetafieldwithValueDynamicFTS query.

{
  "searchRequest": "t",
  "SearchType": "Boolean",
  "Stemming": true,
  "WordNetSynonyms": false,
  "Fuzzy": true,
  "Fuzziness": "1",
  "FilterStatement": {
    "type": "boolean",
    "Operator": "and",
    "clauses": [
      {
        "type": "boolean",
        "Operator": "or",
        "clauses": [
          {
            "type": "boolean",
            "Operator": "and",
            "clauses": [
              { "type": "match", "field": "DocumentContentTypeId", "values": ["13"] }
            ]
          },
          {
            "type": "boolean",
            "Operator": "and",
            "clauses": [
              { "type": "match", "field": "DocumentContentTypeId", "values": ["37"] }
            ]
          },
          {
            "type": "boolean",
            "Operator": "and",
            "clauses": [
              { "type": "match", "field": "DocumentContentTypeId", "values": ["12"] }
            ]
          }
        ]
      },
      {
        "type": "boolean",
        "Operator": "or",
        "clauses": [
          { "type": "match", "field": "Field_69", "values": ["12401"] }
        ]
      }
    ]
  },
  "PageNum": 0,
  "PageSize": 20
}

Could you please check and confirm.

Mar 31, 2021 at 1:59 PM Notified 14 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Also, one more bug was raised by

Naomi

.

Bug No. 22938

Steps to reproduce:

FTS
Enter "t" in the keyword search
In the filter by results in other research tools section, select "Act concerning the conditions of accession of the Republic of Croatia to the European union (2011) (citation and source)" from Search treaty/instrument section
Submit search
View results

Result: There are no documents found

Expected: Will see documents that are underneath this instrument in the AC

Below is the payload request for the above criteria. We found that we are getting 2 rows from the FE_MetafieldwithValueDynamicFTS query.

{
  "searchRequest": "t",
  "SearchType": "Boolean",
  "Stemming": true,
  "WordNetSynonyms": false,
  "Fuzzy": true,
  "Fuzziness": "1",
  "FilterStatement": {
    "type": "boolean",
    "Operator": "and",
    "clauses": [
      {
        "type": "boolean",
        "Operator": "or",
        "clauses": [
          {
            "type": "boolean",
            "Operator": "and",
            "clauses": [
              { "type": "match", "field": "DocumentContentTypeId", "values": ["13"] }
            ]
          },
          {
            "type": "boolean",
            "Operator": "and",
            "clauses": [
              { "type": "match", "field": "DocumentContentTypeId", "values": ["37"] }
            ]
          },
          {
            "type": "boolean",
            "Operator": "and",
            "clauses": [
              { "type": "match", "field": "DocumentContentTypeId", "values": ["12"] }
            ]
          }
        ]
      },
      {
        "type": "boolean",
        "Operator": "or",
        "clauses": [
          { "type": "match", "field": "Field_ACReference", "values": ["22968"] },
          { "type": "match", "field": "Field_ACProvision", "values": ["22968_Generally"] }
        ]
      }
    ]
  },
  "PageNum": 0,
  "PageSize": 20
}

Mar 31, 2021 at 2:01 PM Notified 14 people

Radomir Mladenovic

Harsh

Naomi

what do you expect to match with the keyword "t"? Does it appear in texts? If I remove the keyword, I get 2 results. Also, I can get results using "c" - as I see that "C" appears in the filenames.

Mar 31, 2021 at 8:56 PM Notified 14 people

Harsh Parikh, Tech Lead

Naomi

, please provide your feedback on

Radomir

's comment above as soon as possible.

Mar 31, 2021 at 11:37 PM Notified 14 people

Naomi Joanis, UX Team Lead

Hi

Radomir

,

Maybe this is a mistake on my end, but I would expect to see around the same type of matches I would see in the current version, where any documents that contain the letter "t" would appear if they apply to the filter criteria. Wouldn't the search match what is shown in the document text?

Screen Shot 2021-03-31 at 8.02.48 PM.png 287 KB • Download

Screen Shot 2021-03-31 at 8.02.05 PM.png 276 KB • Download

Apr 01, 2021 at 12:03 AM Notified 14 people

Morgan Maguire, CEO

Hi

Harsh

,

Naomi

and

Radomir

,

Naomi

is correct to be concerned with the difference in results between the legacy app and the new application, and we should get to the bottom of why that is, particularly when no filter is applied.

However, further to the video below, the results for the specific bug referenced above probably have to do with the fact that the 2 documents that are generated by the filter do not have HTML documents available in staging.islg.

At the same time, as described in the video, I noticed that we're displaying paragraph references both vertically and horizontally. Currently, this is very inconsistent and I generally dislike the vertical alignment. Please ensure this is resolved in a way that is consistent.

Thanks,

Morgan

Problem with FTS (31-Mar-21).mp4 10.5 MB • Download

Apr 01, 2021 at 3:30 AM Notified 14 people

Harsh Parikh, Tech Lead

Hi

Naomi

,

Please look into the vertical alignment UI issue. We found that if the page or paragraph count is less than 12, the references are shown vertically.

Apr 01, 2021 at 6:30 AM Notified 14 people

Radomir Mladenovic

Naomi

I'm not sure if "t" is a valid search. The full-text search doesn't search for the letter "T" appearing in text, but for the word "T". A search for "C" brings back results, so maybe "t" as a word is not present in the indexed documents. I don't know why the legacy app finds these documents, but possibly there is some difference in the content. It would require a more thorough investigation comparing content, indexing, testing, etc., but I'm not sure it is worth the resources, as I don't see "t" as a meaningful search keyword.
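
To illustrate the distinction between matching a word and matching a letter, here is a small Python sketch. It uses a simplistic tokenizer and a made-up sample text; it does not model dtSearch's actual tokenization, stemming, or fuzzy matching.

```python
import re
from typing import List


def tokenize(text: str) -> List[str]:
    """Split text into lowercase alphanumeric words (a deliberately naive tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def word_search(keyword: str, text: str) -> bool:
    """Match only when the keyword is a whole word in the text."""
    return keyword.lower() in tokenize(text)


def letter_search(keyword: str, text: str) -> bool:
    """Match when the keyword appears anywhere, even inside another word."""
    return keyword.lower() in text.lower()


# Made-up sample text; "C" is a standalone word here, "t" is not.
sample = "Annex C to the Respondent Press Release"
t_as_word = word_search("t", sample)
c_as_word = word_search("c", sample)
t_as_letter = letter_search("t", sample)
```

With this sample, "c" matches as a word while "t" only matches as a letter inside other words, which is the behaviour described above.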

Apr 01, 2021 at 8:06 AM Notified 14 people

Harsh Parikh, Tech Lead

Hi

Morgan

and

Naomi

,

We have resolved the vertical alignment issue in the page/paragraph UI and uploaded the fix to staging.islg.

Apr 01, 2021 at 8:53 AM Notified 14 people

Harsh Parikh, Tech Lead

Hi

Morgan

and

Naomi

and

Radomir

,

I think both of the above issues are due to missing data and PDF/HTML files on staging.islg. Today we generated the index on app.islg, and it produces results for the letter "t".

Apr 01, 2021 at 11:55 AM Notified 14 people

Morgan Maguire, CEO

OK, thanks

Harsh

. That would explain a significant difference in the results.

Naomi

let's perform testing on app.islg where we have a much more complete set of HTMLs.

Also, let's come up with a better UI solution for the alignment of the paragraphs.

Naomi

,

Melissa

and

Savannah

, could you please ensure

Harsh

is provided with updates to the template code today.

Thanks,

Morgan

Apr 01, 2021 at 2:37 PM Notified 14 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

We have set up indexing for the go-live on 6th April, but when we try to generate the index for the FTS module, the following error is logged and the indexes are not created.

I have attached the indexing log and the indexer JSON file for the live database below. Also, please note that we are using the following live database on the server to generate the index.

Database Name : ISLGRebuildProduction

indexing.log 2.16 KB • Download

indexer-config-fts.json 952 Bytes • Download

Please check and provide feedback.

Apr 04, 2021 at 11:09 AM Notified 14 people

Harsh Parikh, Tech Lead

Hi

Radomir

,

Please ignore the above comment. There was an issue in the query on our side, and we are looking into it.

Apr 04, 2021 at 12:32 PM Notified 14 people

Morgan Maguire, CEO

Hi

Harsh

,

I noticed the DD search is currently not working on app.islg. Is this related to the indexing issues above?

Thanks,

Morgan

Apr 04, 2021 at 5:49 PM Notified 14 people

Harsh Parikh, Tech Lead

Hi

Morgan

,

There was a minor issue generating the index on app.islg. We have resolved it, and search is working fine for all 3 modules on app.islg.

Apr 05, 2021 at 6:20 AM Notified 14 people

Comments & Events