TOLOGIX - Contegra Search Audit

Search Implementations

Hello all,

I had a call with Rob Wiesenberg of Contegra this afternoon about implementing the recommendations outlined in their report: Tologix ISLG - App Audit Review - 2019-08-31.pdf - TOLOGIX - Contegra Search Audit. I've decided I would like Rob and his team to get directly involved in the implementation process by building the custom indexers required for all the subscriber-side searches for the new ISLG and ILG applications. The Contegra team will get involved once the SQL databases are finalized (i.e., after the admin sites are complete) and the UI for the searches is finalized (i.e., after the subscriber site designs are complete), both of which are scheduled to be complete by the end of October.

In the meantime, to ensure everyone has a clear understanding of how our roles are going to be delegated, Rob is going to produce an implementation plan that we'll circulate and review during the next team meeting, scheduled for Thursday, September 26th.

Please let me know if you have any questions or concerns.

Thanks,

Morgan

Comments & Events

Morgan Maguire, CEO
Hello all,

Further to my message above, Rob and his team are going to get involved in building our front-end searches for ISLG and ILG. They will begin development after the UI features and the SQL databases are complete in October. At that time, Contegra will be able to thoroughly review the new database structures and understand the scope of work needed to complete the necessary tasks.

At this point, we anticipate Contegra performing the following:

Custom Indexer Service: Contegra will create dtSearch indexes containing both the full text and the relevant SQL fielded information, which will include:
  1. a command line interface to create a new index from scratch; and
  2. the ability to check the file system directories for new or changed documents and process updates.
Web Service: Contegra will create a web service, which will:
  1. handle all search requests from the client applications (ISLG and ILG);
  2. allow the client to communicate with the web service using a REST-like interface, returning data in JSON format; and
  3. be based on .NET and hosted in IIS.
After the above services are complete, Contegra will then assist us in:
  1. setting up requirements for periodic merging of updates and optimizing indexes;
  2. setting up index backup requirements; and
  3. creating protocols for error handling.
Please let me know if you have any questions about the above.

In the interim, Jitesh, please review and let us know if you foresee any problems with integrating these plans into your existing development timelines.

Thanks,

Morgan 
Morgan Maguire, CEO
Hi Jitesh,

Following up on my note above, could you please confirm whether you foresee any problems with integrating these plans into your existing development timelines?

Thanks,

Morgan 
Jitesh Dhuravala, DevIT
Hi Rob and Radomir,

We are planning to start the dtSearch implementation in ILG, as we need dtSearch indexes for the subscriber-side search functionality. Before we start, it would be good to have a call so that we can follow through on what we discussed on our last call.


Thanks,
Jitesh
Morgan Maguire, CEO
Hi Jitesh,

I had a call with Rob this afternoon, and before we schedule a call with him and Radomir, we should put together some documentation that gives them an opportunity to review the requirements for the subscriber-side searches, so that they can be more informed when they ask and answer questions.

Melissa, we discussed previously getting a document that outlines all the different searches available on the subscriber side. Could you share that document here for Rob to review? Also, if possible, could you ensure the document includes links to the relevant user stories and wireframes, so that the requirements can be easily reviewed?

Thanks,

Morgan 
Melissa Cowell, General Manager at Industrial
Morgan,

Sounds good. I can provide this by the end of the week. 

Mel
Morgan Maguire, CEO
OK. Thanks, Melissa.

Jitesh and Rob, should we schedule the call next week then? Would Tuesday, March 10th at 8am work for you? Note that I'd like to be on the call to confirm how roles are getting delegated.

Thanks,

Morgan
Jitesh Dhuravala, DevIT
Hi Morgan,

Tuesday, March 10th is a holiday for us. Please schedule the call for any other day after the 10th.

Thanks,
Jitesh
Morgan Maguire, CEO
Right. Of course, Jitesh. Does Wednesday, March 11th at 7:30am Vancouver time work?

Rob, please let me know if that works for you as well.

Thanks,

Morgan 
Rob Wiesenberg, Contegra
Hi Morgan,

Unfortunately, we are not available next Wednesday, 3/11, at 7:30 AM (Vancouver time). We are available for a call on Tuesday, Wednesday or Friday next week, after 10 AM your time. Thursday next week is wide open, so we can have a call earlier in the day. Please let me know if one of those times works for you and Jitesh.

Thanks,
Rob 
Morgan Maguire, CEO
Ok. Sounds good, Rob. Let's schedule the meeting at 8:00 on Thursday, March 12th.

I'll send a calendar invite with details.

Thanks,

Morgan
Rob Wiesenberg, Contegra
Morgan,

Great, please let me know if you would like to use a call-in number or Skype. 

Thanks,
Rob
Morgan Maguire, CEO
Hi Rob,

We'll be connecting through Zoom. The details are in the calendar invite I sent out earlier.

Thanks,

Morgan
Morgan Maguire, CEO
Hi Rob,

In preparation for the call on Thursday, Melissa has put together a document that outlines the requirements for all of the searches that will be available on the subscriber side of the new ISLG application: https://docs.google.com/document/d/10TP4xS4YUgmnznIUI2pzzMA2HOZu1FPm8dE7zmgudtA/edit#heading=h.tuh7ytex0a7y.

The document is broken down into four categories of searches:
  1. Search
  2. Research Tools
  3. Document Library
  4. Other
For each search, Melissa has provided screenshots and links to the relevant wireframes. She has also pulled all the acceptance criteria from the relevant user stories. This should give you all the detail you'll need to understand the requirements of the various searches.

For the purposes of dtSearch, the only search that searches the document texts is the Full Text Search (and perhaps the Global Search via the Full Text Search), and thus I believe it is the only search that will require dtSearch. dtSearch is also used for the Subject Navigator in the old application, because we integrated the Boolean and linguistic options available through dtSearch. However, it appears we discarded these requirements in the new application. (Melissa, do you know if we did that purposefully, or is this an oversight in the requirements?)

Let me know if you have any questions or concerns. I would be happy to hop on a call in advance of the call on Thursday to explain anything in the document, so that we can focus on how to optimize all the relevant searches.

Thanks,

Morgan
Jitesh Dhuravala, DevIT
Hi Morgan,

We know how the existing dtSearch is implemented and how it works in the current application, but we have not yet reached the stage of the new ISLG subscriber side covered by the document Melissa has prepared. On the call, we will therefore discuss how to improve the existing dtSearch workflow and implementation. We will expect more information from Rob on the searches described by Melissa.


Thanks,
Jitesh
Morgan Maguire, CEO
Ok. Sounds good, Jitesh. The document above was meant for Rob's benefit more than yours, so that he can get context for our call tomorrow. We'll discuss more tomorrow.

Melissa, following up on my comment above concerning the Subject Navigator search, did we purposefully exclude the ability to perform Boolean searches and use linguistic functions (which we're currently using in the old application)? This will affect whether we use dtSearch in this search.

Thanks,

Morgan 
Melissa Cowell, General Manager at Industrial
Morgan,

We went back and forth on this a couple times. The intention was to simplify/standardize the search across research tools. That being said, we did end up modifying the behaviour for the Subject Navigator and integrating boolean and linguistic tools is certainly possible.

We could incorporate the following criteria:

I can enter a keyword search that is powered by the dtSearch search engine
  • From the SN search input, I will be able to type in a keyword, and run a search
    • Keyword search is performed by entering a term or terms and hitting [Search] to submit
    • I will be able to search using three different methods
    • I can select my keyword search method using a radio button:
      • I can select 'Any words':
        • Searches 'any words' typed into my search field 
          • An "any words" search request consists of an unstructured natural language or "plain English" query. In a natural language search request, words such as AND and OR are disregarded. Use quotation marks to indicate a phrase, + (plus) to indicate a word that must be present, and - (minus) to indicate a word that must not be present.
      • I can select 'All words': 
        • Searches 'all the words' typed into my search field
          • An "all words" search is like an "any words" search except that all of the words in the search request must be present for a document to be retrieved.
      • I can select 'Boolean' (default selection):
        • Searches keyword(s) entered into search field using 'boolean' logic
          • A boolean search request consists of a group of words, phrases or macros linked by search connectors such as AND and OR to precisely indicate the relationship between them.
    • I will be able to enhance my search by selecting various linguistic aids to use
      • I will be able to select 'Stemming'
        • This will search for the word and its variants that share the same stem, e.g. like, likely, likelihood
      • I will be able to select 'Synonyms'
        • This will also search words that are synonyms of the search term, e.g. discrimination, bias, favouritism.
      • I will be able to select 'Fuzzy typo' (default selection - 1 character)
        • This will include words that differ in their spelling from the search term by the number of letters selected, e.g. favor/favour
        • I will be able to select the number of letters to include in the fuzzy typo search by clicking from 1-10. This option will be disabled if the above check box has not been selected.
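
For illustration only, the criteria above could be captured in a small options object along the following lines. All type and member names are hypothetical and are not taken from the application or from dtSearch. The sketch also assumes that "default selection - 1 character" means the fuzzy check box is on by default with a level of 1.

```csharp
using System;

// Hypothetical names: this only mirrors the acceptance criteria listed above.
public enum KeywordMethod { AnyWords, AllWords, Boolean }

public class SubjectNavigatorOptions
{
    // 'Boolean' is the default keyword method in the criteria.
    public KeywordMethod Method { get; set; } = KeywordMethod.Boolean;
    public bool Stemming { get; set; }
    public bool Synonyms { get; set; }

    // Assumption: 'Fuzzy typo' is checked by default with a level of 1.
    public bool FuzzyTypo { get; set; } = true;
    public int FuzzyLetters { get; set; } = 1;

    // Returns 0 when the check box is off (the level selector is disabled),
    // otherwise the selected level limited to the 1-10 range.
    public int EffectiveFuzziness()
    {
        if (!FuzzyTypo) return 0;
        return Math.Max(1, Math.Min(10, FuzzyLetters));
    }
}

public static class OptionsDemo
{
    public static void Main()
    {
        var options = new SubjectNavigatorOptions();
        Console.WriteLine(options.Method + ", fuzziness " + options.EffectiveFuzziness());

        // With the check box cleared, the selected level is ignored.
        options.FuzzyTypo = false;
        options.FuzzyLetters = 7;
        Console.WriteLine("fuzziness " + options.EffectiveFuzziness());
    }
}
```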

Let me know your preference and this can be added to the user story and wireframes.

Mel
Morgan Maguire, CEO
OK. Sounds good. Thanks, Melissa. I'll discuss with Rob and let you know.

Morgan
Morgan Maguire, CEO
Hi Melissa,

Following up on the above, given that we're offering these advanced features in the Subject Navigator in the current application, I think we should have them available in the new application as well. Could you please update the applicable requirements in the user stories? We'll then plan to integrate dtSearch into the searches for the Subject Navigator and the Full Text Search.

Thanks,

Morgan 
Melissa Cowell, General Manager at Industrial
Hi Morgan,

Will do.

Harsh and Ketan, please note that this will require minor edits to the Subject Navigator HTML templates.


Mel
Morgan Maguire, CEO
Hi everyone,

Following up on the call this morning with Ketan, Jitesh, Harsh, Rob and Radomir, we've decided to address the search implementation issues as follows:
  1. Rob and Radomir will build the customized indexes necessary for all searches that utilize dtSearch, which will include the following:
    1. Full Text Search
    2. Subject Navigator Search
    3. Document Library Searches
      1. Treaties & Rules
      2. Dispute Documents
  2. All remaining searches will be performed using SQL database search, which will include the following:
    1. Research Tools:
      1. Jurisprudence Citator search
      2. Article Citator search
      3. Publication Citator search
      4. Terms & Phrases search
    2. Research Notepad
    3. Document Comparison
  3. Ketan, Jitesh and Harsh will assess the requirements and projected performance of the SQL searches above and report back if any of these searches will take more than 2 seconds to produce results. We will then assess whether further customized indexes are required.
  4. Global Search will be performed by sequentially running searches across all the applicable searches, which will include SQL and dtSearch. However, if the performance of the Global Search is insufficient, we will explore the option of building a customized dtSearch index for the Global Search.
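
As a rough sketch of items 3 and 4, a sequential Global Search could time each underlying search and flag any that exceed the 2-second threshold. This is only an assumption about how it might be structured: `SearchSource`, `SourceResult` and the stub searches are hypothetical stand-ins for the real SQL and dtSearch calls, and applying the threshold to every source is a simplification.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical wrapper around one search (SQL or dtSearch) behind a common delegate.
public class SearchSource
{
    public string Name;
    public Func<string, List<string>> Search;
}

public class SourceResult
{
    public string Source;
    public List<string> Hits;
    public TimeSpan Elapsed;
    public bool OverBudget;
}

public static class GlobalSearchDemo
{
    // The 2-second threshold comes from item 3 above.
    public static readonly TimeSpan Budget = TimeSpan.FromSeconds(2);

    // Runs each source one after another and records how long it took.
    public static List<SourceResult> Run(string query, IEnumerable<SearchSource> sources)
    {
        var results = new List<SourceResult>();
        foreach (var source in sources)
        {
            var timer = Stopwatch.StartNew();
            var hits = source.Search(query);
            timer.Stop();
            results.Add(new SourceResult
            {
                Source = source.Name,
                Hits = hits,
                Elapsed = timer.Elapsed,
                OverBudget = timer.Elapsed > Budget
            });
        }
        return results;
    }

    public static void Main()
    {
        // Stub searches; real implementations would query SQL or a dtSearch index.
        var sources = new List<SearchSource>
        {
            new SearchSource { Name = "Full Text Search", Search = q => new List<string> { "doc-1 matches " + q } },
            new SearchSource { Name = "Terms & Phrases", Search = q => new List<string>() }
        };

        foreach (var r in Run("expropriation", sources))
            Console.WriteLine(r.Source + ": " + r.Hits.Count + " hit(s), over budget: " + r.OverBudget);
    }
}
```

Because the searches run one after another, the total response time is the sum of the individual times, which is why a dedicated Global Search index is kept open as a fallback.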
As a first step, I need to confirm some paperwork with Rob. When I confirm that is complete, Ketan, could you please provide Radomir and Rob with the database views they requested on the call? As well, could you please provide them with remote access to the ISLG application server?

Lastly, for ILG, we are likely to adopt a similar approach of creating a customized index for the dtSearch keyword search, but I would like us to finalize things in ISLG before we start that work. Therefore, Ketan, Jitesh and Harsh, please defer all work concerning search features in ILG until we have developed things further in ISLG. However, if you find that deferring this work is disrupting progress in ILG, please let me know, and we will assess and adjust as necessary.

Please let me know if anyone has any questions or concerns.

Thanks,

Morgan 
Morgan Maguire, CEO
Hi Ketan,

Rob and I have completed the necessary paperwork. Could you please provide Rob and Radomir with the necessary database views outlined below and remote access to the ISLG application server?

Here is a summary of the database views needed:

Contegra needs database views from which it can pull data to index. A database view is a queryable object in a database that is defined by a predefined query. In this context, each database view will represent the specific data fields that need to be indexed to support the searches for Full Text Search, Document Library and Subject Navigator, respectively. Though a view does not store data, it can be thought of as a virtual table that can be queried like a table. A view may combine data from more than one table using joins, or contain just the subset of data needed for searching a specific dataset. The views should be created by the team that is already familiar with the data model; otherwise, Contegra would need to spend considerable time understanding the full data model.
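
The "virtual table" idea can be illustrated by analogy in C#, the language of the search service. The real views would of course be defined in SQL on the database server. A deferred LINQ query, like a view, stores only the query definition and re-reads the underlying tables each time it is read. The tables and columns below are hypothetical and much simpler than the actual ISLG data model.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified tables; the real ISLG schema is not shown in this thread.
public class Branch { public int BranchId; public string BranchText; }
public class Document { public int DocumentId; public int BranchId; public string ShortTitle; }

// One row of the "view": only the fields a search would need.
public class SearchRow { public int BranchId; public string BranchText; public string ShortTitle; }

public static class ViewDemo
{
    // Like a view, this returns a query definition that joins two tables;
    // no data is copied until the query is enumerated.
    public static IEnumerable<SearchRow> SearchView(List<Branch> branches, List<Document> documents)
    {
        return from b in branches
               join d in documents on b.BranchId equals d.BranchId
               select new SearchRow { BranchId = b.BranchId, BranchText = b.BranchText, ShortTitle = d.ShortTitle };
    }

    public static void Main()
    {
        var branches = new List<Branch> { new Branch { BranchId = 1, BranchText = "Jurisdiction" } };
        var documents = new List<Document> { new Document { DocumentId = 10, BranchId = 1, ShortTitle = "Award" } };

        var view = SearchView(branches, documents);
        Console.WriteLine(view.Count()); // one joined row

        // A row added to the underlying table appears the next time the "view" is read.
        documents.Add(new Document { DocumentId = 11, BranchId = 1, ShortTitle = "Decision" });
        Console.WriteLine(view.Count()); // two joined rows
    }
}
```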

Rob and Radomir can provide further detail as required.

Thanks,

Morgan
Ketan Sondarva, Technical Project Manager at DevIT
Hi Morgan,

We will provide the database table structure, with fields and the related screen views, to the Contegra team by the end of this week.
We have started working on the Subject Navigator module, so the Contegra team can initially start working on Subject Navigator.
As we move along, we will keep them updated on the other modules as well.

Also, for database access, do you need Contegra to have live database server access (Carbon60) or local development server access (at DevIT)?

Thanks,
Ketan Sondarva
Morgan Maguire, CEO
Hi Ketan,

As discussed this morning, please provide Rob and Radomir with access to both the live database server (via Carbon60) and the local development server (via DevIT).

For the live server access, please request additional credentials from Carbon60.

Thanks,

Morgan
Morgan Maguire, CEO
Thanks, Harsh. When possible, could you please provide similar documents for the Full Text Search and Document Library searches?

Rob, please let us know whether the document above provides you with the information you need.

Thanks,

Morgan
Rob Wiesenberg, Contegra
Morgan, Harsh, thank you for the document. Radomir and I will review and comment shortly.
Harsh Parikh, Tech Lead at DevIT
Hi Rob and Radomir,

The document attached above is very basic, and I know you will want to go deeper.

Please go through it, and if you need further details, we can schedule a call to discuss what you need and then provide it.
Radomir Mladenovic, Contegra
Hi Harsh,

Thanks for the document. Yes, it would be good to have a call and go through the details. For example, you list "DocumentValue" but it's not clear if that's a table or a column as I don't see it in the diagram.

Basically, what we need comes down to a couple of questions:
- Which columns are full-text searchable?
- Which additional columns do you need in search results? (e.g. ID field(s) that you use to create content URLs.)

We can schedule a call to go through it together. As it will be quite technical, I guess we can do it without bothering everyone else. Just drop me a note when you're available for a call. We could also use chat instead (e.g. Skype, Slack).

Thanks,
Radomir
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

Sounds good to me. Can we have a call next Tuesday at 6:30 PM IST?

Here are my Skype details:

Skype id : harsh.parikh05

Jitesh and Ketan will also join the call.
Radomir Mladenovic, Contegra
Hi Harsh,

Great, I just sent you a contact invite by Skype. Talk to you on Tuesday.
Harsh Parikh, Tech Lead at DevIT
Hi Morgan,

Today, we had a call with the Contegra team to discuss the Subject Navigator search. We have one question for you.

Up to how many levels do you want the search to run? Do you want it to search the Dispute Document Full Citation and the cited paragraph number?
Morgan Maguire, CEO
Hi Harsh,

Questions like this are answered in the document produced by Melissa here: https://docs.google.com/document/d/10TP4xS4YUgmnznIUI2pzzMA2HOZu1FPm8dE7zmgudtA/edit.

In the acceptance criteria for Subject Navigator, the keyword search will be performed for the following fields:
  • Branch Text fields for all Subject Navigator branches
  • For any Dispute Document associated with a Subject Navigator branch, the applicable:
    • Dispute fields: Respondent State, Case Name, Case Number and Special Search Terms
    • Dispute Document fields: Short Title and Full Citation
Therefore, it would include the Dispute Document Full Citation (and other fields related to the document), but not the cited paragraph number or text. Going forward, it's very important that you consult this document and the applicable user stories to understand the requirements of each search.

Thanks,

Morgan

cc: Radomir, Rob
Harsh Parikh, Tech Lead at DevIT
Hi Morgan,

Thanks for the clarification.

We currently handle the "Special Search Terms" field as a meta field, while the other fields are required and hard coded.

Can we make the Special Search Terms field hard coded as well? We have only identified hard-coded fields for the data model view.
Morgan Maguire, CEO
Melissa, what impact would hard coding the Special Search Terms field have on the master lists? Would this mean that filling the field with data would be required? I ask because it will be common for this field to be left blank.

Thanks,

Morgan
Melissa Cowell, General Manager at Industrial
Hi there Morgan,

No, hard coded fields are not required. This would be fine.

Mel
Morgan Maguire, CEO
Ok. Great. Thanks, Melissa.

Harsh, hard coding the Special Search Terms field is fine.

Also, I assume this means that any field we integrate into a search will need to be hard coded?

Morgan
Harsh Parikh, Tech Lead at DevIT
Yes, you are right, Morgan. The search field must be hard coded.
Morgan Maguire, CEO
Hi Harsh,

Following up on our conversation this morning, it will not be possible for us to hard code all the fields that will be used to populate searches, particularly for a number of the filters used in the Full Text Search and Document Library searches. Therefore, please review the search requirements in the search document: https://docs.google.com/document/d/10TP4xS4YUgmnznIUI2pzzMA2HOZu1FPm8dE7zmgudtA/edit, and let us know how you plan to deal with the situation. Note it's very important that we limit the number of hard-coded fields, so that we can adjust fields as required in the future.

Thanks,

Morgan
Harsh Parikh, Tech Lead at DevIT
Hi Rob and Radomir,

As discussed on the last call, we have attached the spreadsheet for the Subject Navigator Search, which contains all the view columns (currently with dummy data). We have also attached the query that returns this set of columns.

Please check and let us know if you have any concerns.

Radomir Mladenovic, Contegra
Hi Harsh,

In which database can I try this query? I was looking at SQL Server at 10.68.138.11 but couldn't find any that contains referenced tables.

I don't really understand the spreadsheet you sent but if the SQL you made returns data that should be indexed, that should be sufficient.

Thanks,
Radomir
Harsh Parikh, Tech Lead at DevIT
Hi Rob,

You can try the query attached above on the ISLGRebuild database on the 10.68.138.11 server.

When you run the attached query on the ISLGRebuild database, you get the result columns that we included in the attached spreadsheet.

Please note that all currently available data is dummy data.
Radomir Mladenovic, Contegra
Hi Harsh,
I successfully executed the query on ISLGRebuild. Looks fine to me. Could you please just create a view for it so we can do a simple "select * from <view_name>" to get the data?
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

We have created a view for the Subject Navigator Search.

You can use Select * From vw_SubjectNavigatorSearch on the ISLGRebuild database.
Radomir Mladenovic, Contegra
Thanks, Harsh. I'll look into it and let you know if anything comes up.
Radomir Mladenovic, Contegra
Hi Harsh,

I need a small change to the view... We need a column that can be used as a "document" identifier in the index. As this view was created by joining several tables, it's hard to tell what's unique.

Could you please re-create the view, making sure the first column is "id", e.g. by building it as a string combination of column values and a delimiter. Something like:
 CONCAT(branchId, '/', ParentId, '/', documentId) as ID, ... (continue with columns as in the current view)

I hope this makes sense. Please let me know if you have any questions.

Thanks,
Radomir
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

Sorry, I don't understand what you want.

My assumption is that you want a column in the view that refers to the document name.

Please clarify, or we can have a short call to discuss.

I am available on Skype on all working days (10:00 AM to 8:00 PM IST).
Radomir Mladenovic, Contegra
Hi Harsh, I just need a column (number or string) that can be treated as an identifier of a row. dtSearch requires each document to have a unique identifier. We could use a random value, but that's not ideal because it prevents eventual use of incremental indexing (e.g. we could not update a record, as we wouldn't know the ID associated with the row).
The problem I have now is that I used the branchId (the first column) as the dtSearch document ID but, as this is not a unique value in the view, we overwrite records and end up without all rows in the index.
So, I suggested creating an "artificial" ID that consists of all the relevant IDs of the records the row is built from, joined together with a delimiter character, as in the example I gave.
Just put this column first in the view; that's how we'll know it's the ID column.
If you need additional clarification, feel free to contact me on Skype.
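
The overwrite problem described above can be reproduced with a small stand-in for the index, sketched here for illustration only. A dictionary keyed by document ID plays the role of the index, and `ViewRow`, `BuildId` and `CountIndexed` are hypothetical names, not part of dtSearch or of the actual indexer.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical row shape: only the ID columns from the CONCAT example.
public class ViewRow
{
    public int BranchId;
    public int ParentId;
    public int DocumentId;
}

public static class CompositeIdDemo
{
    // Joins the row's own IDs with a delimiter, as in the CONCAT suggestion.
    public static string BuildId(ViewRow row)
    {
        return row.BranchId + "/" + row.ParentId + "/" + row.DocumentId;
    }

    // Simulates an index keyed by document ID: a repeated key replaces the earlier entry.
    public static int CountIndexed(IEnumerable<ViewRow> rows, Func<ViewRow, string> keyOf)
    {
        var index = new Dictionary<string, ViewRow>();
        foreach (var row in rows)
        {
            index[keyOf(row)] = row;
        }
        return index.Count;
    }

    public static void Main()
    {
        var rows = new List<ViewRow>
        {
            new ViewRow { BranchId = 1, ParentId = 0, DocumentId = 10 },
            new ViewRow { BranchId = 1, ParentId = 0, DocumentId = 11 },
            new ViewRow { BranchId = 2, ParentId = 1, DocumentId = 10 }
        };

        // Keyed by branchId alone, the two rows of branch 1 collide: 2 entries survive.
        Console.WriteLine(CountIndexed(rows, r => r.BranchId.ToString()));

        // Keyed by the composite ID, every row is kept: 3 entries.
        Console.WriteLine(CountIndexed(rows, BuildId));
    }
}
```

Because the composite ID is derived from the row's own key values, the same row receives the same ID on every run, which is the property incremental indexing depends on.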
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

Can I set ROW_NUMBER() as the unique ID for you, as the first column?
Radomir Mladenovic, Contegra
Hi Harsh, OK, let's go with ROW_NUMBER().
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

We have added ROW_NUMBER() as the unique ID in the view. We have given it the alias Id and set it as the first column.

Please check and confirm.
Radomir Mladenovic, Contegra
Hi Harsh,

Yes, the updated view looks good. I indexed it and the index is in "C:\Temp\test-index\subject-nav" on the Web server.

Now, do you want us to create a helper library to consume the indexes, or would you prefer to do it on your own? Would you prefer a Controller to access it as a web service, or a library (DLL)? Which C# framework version do you use?
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

Yes, we would like you to create the helper for consuming the indexes. We are using .NET Core 2.1 and C# 7.0.
Radomir Mladenovic, Contegra
Harsh, OK, but how do you prefer to use it? As a library, so you call it from code, or as a Controller to call from JavaScript?
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

We prefer a Controller to call from JavaScript, because if we need some customization, we can do it easily.

This is my opinion. Please suggest whichever approach you think is best.
Radomir Mladenovic, Contegra
Hi Harsh,

I prepared the first version of the search web service. It's on the web server in "D:\TologixWebSearch". It's a .NET Core 3.1 application.
Could you install this app on the web server so I can finalize the setup?
To check if it's running, you can send POST /api/search/subject-nav with the parameter searchRequest=branch, for example.

BTW, I just realized that you said you're using Core 2.1, not 3.1. If that's a problem, let me know and I'll downgrade the code.

Thanks.
Harsh Parikh, Tech Lead at DevIT
Yes, Radomir. We are using .NET Core 2.1.
Radomir Mladenovic, Contegra
Harsh, I downgraded the search application to .NET Core 2.1. It's in the same folder, "D:\TologixWebSearch".
Radomir Mladenovic, Contegra
Hi Harsh, did you have a chance to install the search application?
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

I have not yet gone through the search application, as I am busy completing other work. I will go through it within a day or two and get back to you.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

I went through the following path on the server at 10.68.138.10:

E:\TologixWebSearch

I found that you put the published code in that folder. Am I right?

Can we have a call tomorrow at 6:00 PM IST (Ahmedabad time)?
Radomir Mladenovic, Contegra
Hi Harsh, yes, that's the publish folder.
Yes, we can talk Wednesday at 6:00 PM IST.
Harsh Parikh, Tech Lead at DevIT
Hi Morgan,

We had a call with Radomir about the Subject Navigator search, and everything is going well.

It would help us to get the source code that Radomir developed for SN, for demo purposes, so that we can test it and get an idea of how to do the same thing in our local environment.

Radomir, could you please confirm whether that is possible?
Morgan Maguire, CEO
Great, Harsh. Look forward to seeing it when it's ready to be deployed as a demo.

Thanks,

Morgan
Harsh Parikh, Tech Lead at DevIT
Morgan, could you confirm that the Contegra team can provide their source code to us for demo purposes?
Morgan Maguire, CEO
Radomir and Rob, can you respond to Harsh's inquiry above about the source code?

Thanks,

Morgan
Morgan Maguire, CEO
Hi Harsh,

Rob said the source code will be delivered when the entire project is complete. However, he said that you should be able to perform all the testing you need to do in the interim.

Is there a reason you need the source code now? If so, please specify why, and we can work something out.

Thanks,

Morgan
Harsh Parikh, Tech Lead at DevIT
Hi Morgan,

We need the source code only to check how the custom dtSearch indexing works, how we can pass the input parameter model, and how the data is returned in JSON format.

We need the source code only once. For example, Radomir developed the web app for Subject Navigator; once we understand how it works, we won't need the source for the rest of the modules.

This is the first module we are going to implement, so we need to be sure on our side that everything is proceeding properly.
Radomir Mladenovic, Contegra
Hi Harsh, as discussed, I've updated the service to accept JSON POST requests. The application is in the same folder (E:\TologixWebSearch) and the sample index is in C:\Temp\test-index\subject-nav.

Example: POST /api/search/subject-nav
{
"searchRequest": "pink link",
"searchType": "Phrase"
}
or
{
"searchRequest": "pink link fooooo",
"searchType": "AnyWords"
}

Here's the object model from where you can see all available options:

using System; // needed for DateTime

public class SearchModel
{
    public string SearchRequest { set; get; }

    public int PageNum { set; get; }
    public int PageSize { set; get; }
    public bool Fuzzy { set; get; }
    public int Fuzziness { set; get; }
    public bool Stemming { set; get; }
    public bool WordNetSynonyms { set; get; }
    public bool Synonyms { set; get; }
    public bool PhonicSearching { set; get; }
    public SearchType SearchType { set; get; }
    public string SortField { set; get; }
    public string SortOrder { set; get; }

    public int SearchFlags { set; get; }

    public bool EnableDateSearch { set; get; }
    public DateTime? StartDate { set; get; }
    public DateTime? EndDate { set; get; }

    public string FileConditions { set; get; }

    public string BooleanConditions { set; get; }

    public bool IncludeSynopsis { set; get; }

    public int Near { set; get; }

    public bool ExcludeEnabled { set; get; }
    public string ExcludeTerm { set; get; }

    public SearchModel()
    {
        IncludeSynopsis = true;
        Stemming = true;
        Fuzziness = 4;
        SearchType = SearchType.AllWords;
        Near = 14;
        SearchRequest = null;
    }
}

public enum SearchType
{
    NoValue,
    AllWords,
    AnyWords,
    Boolean,
    Phrase,
    NearTerm
}

Sort order options:
    scoredesc
    scoreasc
    hitsdesc
    hitsasc
    locationasc
    locationdesc
    documentasc
    documentdesc
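
For illustration, here is a minimal sketch of how a client might build the POST request described above (Python used for brevity). The base URL is a placeholder, and the camelCase key "sortOrder" is an assumption inferred from the two confirmed keys (searchRequest, searchType); check it against the service before relying on it.

```python
import json
import urllib.request

# Hypothetical base URL; replace with the host where TologixWebSearch runs.
BASE_URL = "http://localhost:5000"


def build_search_request(text, search_type="AllWords", sort_order="scoredesc"):
    """Build (but do not send) a JSON POST request for the subject-nav search."""
    body = {
        "searchRequest": text,      # key confirmed by the examples above
        "searchType": search_type,  # one of the SearchType enum names
        "sortOrder": sort_order,    # assumed camelCase form of SortOrder
    }
    return urllib.request.Request(
        BASE_URL + "/api/search/subject-nav",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request("pink link", search_type="Phrase")
# To actually send it: urllib.request.urlopen(request).read()
```

Building the request separately from sending it makes it easy to log or inspect the exact JSON body while testing.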

Let me know if you have any questions.
Morgan Maguire, CEO
Hi Harsh Parikh, Tech Lead at DevIT Harsh ,

I've got confirmation from Rob Wiesenberg, Contegra Rob that Radomir Mladenovic, Contegra Radomir will be providing you the source code for the Subject Navigator module in due course. 

Thanks,

Morgan
Harsh Parikh, Tech Lead at DevIT
OK Thanks Radomir Mladenovic, Contegra Radomir .

We will check and try it and let you know if we face any issue.
Morgan Maguire, CEO
Hi Harsh Parikh, Tech Lead at DevIT Harsh and Ketan Sondarva, Technical Project Manager at DevIT Ketan ,

I got a note from Rob Wiesenberg, Contegra Rob this morning asking whether we had any feedback on the custom indexer for the Subject Navigator. Could you provide an update on where we stand on this issue? Is the indexer performing as expected? Also, would it be possible for me and my team to get a preview of how it works?

Also, what are the next steps in starting work on the next indexers (Full Text Search and Document Library searches: https://docs.google.com/document/d/10TP4xS4YUgmnznIUI2pzzMA2HOZu1FPm8dE7zmgudtA/edit)? Phase 2 of Subscriber side development includes the Disputes & Dispute Documents, which is scheduled to start development on May 21 and will include the Document Library searches. But I'm wondering if Rob Wiesenberg, Contegra Rob and Radomir Mladenovic, Contegra Radomir could get started on their work in advance to ensure the indexers are ready when the relevant development phases start?

Thanks,

Morgan
Harsh Parikh, Tech Lead at DevIT
Hi Morgan Maguire, CEO Morgan ,

We are planning to integrate the Subject Navigator indexes after 20th May (after completing the core development of Phase 1).

For the Dispute & Dispute Document Library, we will take it up in Phase 2 development and give it first priority.

Full Text Search is a big module, so we will prioritize it in Phase 3.

Ketan Sondarva, Technical Project Manager at DevIT Ketan will discuss this in more detail with you on today's call.
Morgan Maguire, CEO
Ok. Thanks Harsh Parikh, Tech Lead at DevIT Harsh .

After speaking with Ketan Sondarva, Technical Project Manager at DevIT Ketan this morning, he'll speak to you and the team about coming up with a specific date on when Rob Wiesenberg, Contegra Rob and Radomir Mladenovic, Contegra Radomir can expect to start working on the next set of custom indexes. We'll discuss further during Thursday's call.

Thanks,

Morgan
Morgan Maguire, CEO
Hello everyone,

Further to discussions and emails with Rob Wiesenberg, Contegra Rob and Ketan Sondarva, Technical Project Manager at DevIT Ketan this week, it sounds like we are nearing the point of getting Rob Wiesenberg, Contegra Rob and Radomir Mladenovic, Contegra Radomir back involved in the project.

To start things off, Ketan Sondarva, Technical Project Manager at DevIT Ketan , could you please provide more details on the timeline for Rob Wiesenberg, Contegra Rob and Radomir Mladenovic, Contegra Radomir 's involvement, and what the steps and expectations will be over the next few weeks.

Thanks,

Morgan
Ketan Sondarva, Technical Project Manager at DevIT
Hi Rob, Radomir,

We will start integrating the Subject Navigator index you provided in the first week of Aug. In the meantime, we will share the SQL View for the Dispute & Dispute Document module with you by 5th Aug so your team can start working on index creation for it. For FTS (Full Text Search), we are planning to send you the SQL View by mid-Aug, as by then we should have finished our integration with Subject Navigator and be working on Dispute & Dispute Document.

Also, I would like to know how long it generally takes to create an index for such a module if we provide the proper SQL View details on time, so that we can plan our development and integration accordingly.

Thanks,
Ketan Sondarva
Radomir Mladenovic, Contegra
Hi Ketan,

It would be great if you could provide the SQL Views as soon as possible so we can review them and start working. It's hard to tell how much time is needed for indexing; it depends on the amount of data, the number of files to be indexed, system and database performance, etc. Even if you don't have all documents ready but have a decent amount, we can test with that and get some numbers from it.

Note that my availability in August is limited. I have ongoing projects, but I can still dedicate some hours to your project in the first half of Aug. However, from Aug 15 I have scheduled vacation and on-site work planned, so my availability will be very limited until the second week of September.

Thanks,
Radomir
Ketan Sondarva, Technical Project Manager at DevIT
Hi Rob,

Thanks for the update; noted.

We have some questions about the Dispute & Dispute Document SQL View. Could we connect on 29th or 30th July, 2020 at 6:00 PM IST to discuss them? That will give you a better idea of the SQL View, and you can also raise any questions from your side.

Thanks,
Ketan Sondarva
Harsh Parikh, Tech Lead at DevIT
Hi Rob Wiesenberg, Contegra Rob and Radomir Mladenovic, Contegra Radomir ,

We have some queries about the Dispute & Document SQL View. Are you able to take a call on Skype tomorrow (30th July) at 6:00 PM IST?
Rob Wiesenberg, Contegra
That time works for me. Radomir Mladenovic, Contegra Radomir  are you available at that time?
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , unfortunately, I'm not available tomorrow at that time. Tomorrow I can do it 11:00 AM IST - it's too early for Rob but, as it's a technical call, I guess it's not necessary for him to participate. Rob Wiesenberg, Contegra Rob , is that ok with you?
Otherwise, I'm available on Friday (31st July) at 6:00 PM IST.
Rob Wiesenberg, Contegra
Radomir Mladenovic, Contegra Radomir / Harsh Parikh, Tech Lead at DevIT Harsh , I can meet on July 31 at 6:00 PM IST, but if that is not convenient, please go ahead and meet at a convenient time without me.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh I copied DB indexer to folder E:\Programs\TologixDBIndexer on 10.68.138.10. Talk to you tomorrow.
Harsh Parikh, Tech Lead at DevIT
Hi Rob Wiesenberg, Contegra Rob   and Radomir Mladenovic, Contegra Radomir ,

We have a few queries regarding the search results in the Dispute & Document Library module, which we will discuss with the Industrial team on Monday. After that, we will provide the SQL view for the Dispute & Document Library.
Harsh Parikh, Tech Lead at DevIT
Hi Rob Wiesenberg, Contegra Rob and Radomir Mladenovic, Contegra Radomir ,

We are working on the Subject Navigator Contegra Search integration in our application, and we are getting the search results through the API call.

For the next module, Dispute & Document Library, I have added the following SQL View:

vw_DisputeDocumentLibrarySearch on the ISLGRebuild database
(server: 10.68.138.11)

However, I have a concern regarding the search parameters.

As per the above screenshot, there are many search parameters that we need to pass when the user clicks on filter, and the results must match the selected parameters.

For example, suppose we enter the keyword "ICSID" in the text box and select the language "English" from the search parameters. We then need the index to return only results that match both the keyword ICSID and the language English.

So, my question is: how will the index return results filtered by the different parameters that I selected?

If you need to discuss, we are available on 13th August, as tomorrow is a national holiday.

Please suggest.
Radomir Mladenovic, Contegra
Hi Harsh,

You need to pass additional filters via Custom object, for example:

{
"searchRequest": "pink link",
"searchType": "AnyWords",
"custom": {
"language": "English"
}
}

You can see the implementation of this filtering in the SearchController.cs, lines 201-217.
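
As a rough sketch of the "ICSID" plus "English" case from the question above, the request body could be assembled like this (Python used for brevity). The helper name is made up for illustration, and it assumes that each key inside "custom" is matched against an index field of the same name, as in the "language" example.

```python
import json


def with_custom_filters(search_request, search_type, **filters):
    """Return a request body whose "custom" object carries extra field filters."""
    body = {"searchRequest": search_request, "searchType": search_type}
    if filters:
        # Each keyword argument becomes one entry of the "custom" object.
        body["custom"] = dict(filters)
    return body


body = with_custom_filters("ICSID", "AllWords", language="English")
print(json.dumps(body, indent=2))
```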

Hope this helps. Let me know if you have any questions.

Regards,
Radomir
Radomir Mladenovic, Contegra

Harsh Parikh, Tech Lead at DevIT Harsh , as discussed, I'm attaching an example of how you can deserialize the search engine response to a custom model. After deserialization, you can simply access fields as in:
response.Results[0].Data.Filename
Hope this helps.
Harsh Parikh, Tech Lead at DevIT
Hi Rob Wiesenberg, Contegra Rob and Radomir Mladenovic, Contegra Radomir ,

As discussed on Friday's call (14/08), we have added a HierarchicalParentIds column to the Subject Navigator SQL view.

  • vw_SubjectNavigatorSearch
We have updated the SQL view in the ISLGRebuild database on the 10.68.138.11 server.

We have kept the ParentId column as it is and added the new HierarchicalParentIds column, which holds multiple parent IDs separated by commas, as per the following screenshot.

Please let us know if you have any concerns.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh I updated the indexer so that it can also run as a command-line tool.
You can get it from https://www.dropbox.com/s/fxd5acvnpxw7h6e/DBIndexer-with-cmd-line.zip?dl=0
If you run TlogixDBIndexer and pass a filename as an argument (which is a config file), it will run in command-line mode only. You can find a sample config file, "index-config-subject-nav.json", in the archive.

I'll look at the parents field soon.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh I have one more indexer update for you:
https://www.dropbox.com/s/riuyasnjn9rv7ll/DBIndexer3.zip?dl=0
It includes 3 dtsearch data files that define some indexing behavior (e.g. noise words). The code was updated to load them from the application folder. (This change was needed to fix behavior where single letter terms were not indexed so, for example, document ID with value "1" could not be found.)

Next, I implemented support for finding and returning the parent nodes listed in the HierarchicalParentIds column. The search response payload now includes a parents field, which is a list of documents in the same format as results. A sample response looks like this:

{
    "totalResults": 7,
    "results": [
        {
            "fields": {
                "id": "35",
                "branchid": "35",
                "branchtypeid": "3",
                "branchname": "Test new subject",
                "Filename": "db://vw_SubjectNavigatorSearch#Id=35",
                "parentid": "20",
                "documentid": "15",
                "contenttypedatamasterid": "29",
                "disputedocumentshorttitle": "OT/0001/03 - NJ Test Case AF-0005-01 Destination Code - 31/08/2020 - English",
                "shorttitle": "NJ Test Case AF-0005-01 Destination Code",
                "fullcitation": "NJ Test Case, AF-0005-01 Destination Code, 31 August 2020",
                "casename": "NJ Test Case"
            }
        },
...
    ],
    "parents": [
        {
            "fields": {
                "id": "27",
                "branchid": "27",
                "hierarchicalparentids": "1,27",
                "branchtypeid": "3",
                "branchname": "Abuse of Process",
                "Filename": "db://vw_SubjectNavigatorSearch#Id=27",
                "parentid": "1"
            }
        },
...
        {
            "fields": {
                "id": "1",
                "branchid": "1",
                "hierarchicalparentids": "1",
                "branchtypeid": "3",
                "branchname": "A",
                "Filename": "db://vw_SubjectNavigatorSearch#Id=1"
            }
        }
    ]
}
Note that the parent list contains only elements which are not present in the results list.

You can get search service update from:
https://www.dropbox.com/s/1y4e4tke94sqngn/TologixWebSearch.zip?dl=0
Harsh Parikh, Tech Lead at DevIT
Hi Rob Wiesenberg, Contegra Rob   and Radomir Mladenovic, Contegra Radomir ,

We need all result data, both parents and branches, in one array. We don't need a separate array for the parent data.

Could you please modify this and let us know? Also, the above Dropbox link is not working.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh

I separated results and parents for a reason. How would you otherwise know which node contains a hit and which node is there only as support? It's one line of code on your end to join both arrays if you really need them that way. But if you're sure you don't need the separation between hits and non-hits, I can join them in the service. Please confirm.
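
To illustrate, joining the two arrays on the client while still remembering which nodes were hits could look like the sketch below (Python used for brevity). It assumes the response shape from the sample above; the "isHit" key is a made-up name, not part of the service response.

```python
def merge_with_hit_flag(response):
    """Join "results" and "parents" into one list, marking which nodes matched."""
    merged = []
    for item in response.get("results", []):
        merged.append({**item["fields"], "isHit": True})
    for item in response.get("parents", []):
        merged.append({**item["fields"], "isHit": False})
    return merged


# Trimmed-down sample with the same structure as the response above.
sample = {
    "results": [{"fields": {"id": "35", "parentid": "20"}}],
    "parents": [
        {"fields": {"id": "27", "parentid": "1"}},
        {"fields": {"id": "1"}},
    ],
}
nodes = merge_with_hit_flag(sample)
```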

BTW, which Dropbox link is not working? I tried both from the previous message and they both open.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We don't understand the purpose of separating hits and non-hits.

Our understanding is that we need to pull all branch and parent data in one array, and we will convert that result to our model and pass it to the presentation layer.

Could you please clarify the meaning and use of the hits and non-hits separation?

Also, the above Dropbox link does not open from our end. I have attached a screenshot for your reference.

Radomir Mladenovic, Contegra
I thought you might need info on where the actual hits are. If you don't care, I'll merge the arrays. It's easy to merge, but not that easy to separate later if you need hit info. I'll update the code and send you an update later today.

As for Dropbox, check your network. Works fine for me.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We don't need to display the hit count, so just merge the arrays and let us know once you have updated the service.

We will try the Dropbox link again and pull the service into our local environment.

One more question regarding the Examplecustommodel.zip file above: do we have to create a different result model for each tool?

For example, the example model above works for SN, but do we need to create another model for DisputeDocumentLibrary?
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh ,

I made a change to the search service to add parents to the results list.
You can get it from https://www.dropbox.com/s/8va4x1q2jkzptls/TologixWebSearch2.zip?dl=0

As for the custom model, if DisputeDocumentLibrary or another index has different fields, you will need a different model. I mentioned this as a downside of a custom model during our last call.
What you could do is make one model with all possible fields across all the different indexes you're going to create. It's messy, but it's a single model.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir and Rob Wiesenberg, Contegra Rob ,

We are not getting results for all parent IDs from the indexes. We looked into your TologixWebSearch application and found that you are using Ids in place of BranchIds.

Can we have a call on Monday (31st August) at 2:00 PM IST about the Subject Navigator index results?
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh I'm not available for a call on Monday. But do we need a call for this?
Please send me details of what exactly the issue is and you'll have a fix by Monday.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

You are using the Id column instead of the BranchId column to look up the parent IDs and to return the parent results.

The Id column is used on lines 84 and 112. We changed those to use branchid instead, and now we get all results.

Please confirm whether that is OK.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh ,
So, if I understood correctly, you modified the view to use branchid in the id field and now it works fine? Sure, if you see no side effects, it's fine.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ​,

I did not modify the view. I used branchid in place of id on lines 84 and 112 in your TologixWebSearch application, then published the code again and checked.

It works fine.

Is the change we made in the SearchController.cs file on lines 84 and 112 OK?
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , sure that's fine. I made the same change on my side so we're in sync.
Harsh Parikh, Tech Lead at DevIT
OK, thanks Radomir Mladenovic, Contegra Radomir. Also, we need to know how we can highlight the search keyword using dtSearch.

Are you available to take a call on Tuesday?
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh on Tuesday I'm available at 11:00 AM IST. Let me know if that works for you.
In general, we can add highlighting by extending the search service. Highlighted text could be added as a field to the JSON result object.

BTW, I saw your message on skype:

> we need to know how we generate indexes on daily basis whiout do manual process to generate indexes

I already sent you an indexer update that runs from the command line. See my message and the Dropbox download URL that I sent on Aug 17 above. You need to set up Windows Scheduler to run the indexer, and that's it.
Harsh Parikh, Tech Lead at DevIT
Yes, Radomir Mladenovic, Contegra Radomir. We are available on Tuesday at 11:00 AM IST. We saw your message about automated indexing, but we have not been able to get it working.

We will discuss on Tuesday.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

I have attached the SearchController.cs file with the changes we made for the Subject Navigator index.
Please replace that file in your project.

As discussed, we need to highlight the matched word in the presentation view, as we currently do with the jQuery highlighter.

The word should be highlighted in the following columns:

Respondent State, Case Name, Case Number and Special Search Terms, Short Title and Full Citation.

Also, please let us know when we can schedule a call about generating indexes through the command line.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

As discussed on today's call, we need to provide 2 SQL Views. The 1st SQL View contains all the dynamic columns, identified by column name (e.g., Field_100, Field_101), and the 2nd SQL View contains all the columns that we need to bind to our model.

As discussed, you will get the DisputeId column from the 1st SQL View, pass its value to the 2nd SQL View, and provide us the result in JSON format so we can bind the data to our model.

However, as we mentioned in our email to you about the 1st SQL View, its dynamic column structure makes it infeasible to create as a view. Hence, we are currently providing a stored procedure in place of the 1st SQL View; it returns all the dynamic columns.

We are able to generate the 2nd SQL View because it contains fixed columns.

Here are the stored procedure and the 2nd SQL View:

Stored procedure name (in place of the 1st SQL View): FE_MetafieldwithValueDynamic

2nd SQL View (pass it the DisputeId column you get from the stored procedure above):

VW_DisputeContegraSearch

You can use the ISLGRebuild database on the 10.68.138.11 server to find the stored procedure and the SQL view.

Please let us know whether the stored procedure works on your side. We are also ready to discuss over a call.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh ,

Which field identifies a "document" for this index? I think we mentioned ContentTypeDataMasterId in the past but now you reference the DisputeId. Should DisputeId be used as a document identifier?

I checked the stored proc result and the view. A possible problem I see is that for DisputeId=23 view VW_DisputeContegraSearch returns multiple rows. I can index data as received but in the results you will not be able to tell which field goes with which ContentTypeDataMasterId. (For example, you have Field_299=15532 for ContentTypeDataMasterId=58, but ContentTypeDataMasterId=59 has the same DisputeId so will be part of the same document.)
Is that how you wanted things to be indexed? Please confirm.

I'll be available on Skype in the following hour or two if you want to discuss it quickly.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Yes, use the DisputeId column as the document identifier.

For the 2nd point, I will ping you on Skype right now.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Thanks for the quick call today. As discussed, we need to pass the DisputeId column to the 2nd SQL View and get index results for all of the 2nd SQL View columns.

Also, we need the search result count and highlighting of the matching word across all the columns.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh ,

After closer examination of the data returned by the stored proc and the view, I still see some issues. 

The problem is that both return rows with duplicate DisputeId values. For example, multiple rows exist for both DisputeId 7, 30, etc. As we use DisputeId for a document identifier, any duplicate appearing in the first table  will overwrite the previously indexed document with the same DisputeId.

1) It's crucial that the first indexed table/view (here we use the result of the stored proc) returns rows that have unique document identifier. For the Disputes Search it makes sense that the DisputeId is used, but values should be distinct.

2) The second view indexed may return multiple rows for the same DisputeId and all columns will be indexed as a part of the same indexed document.
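
To make the overwrite problem concrete, here is a small sketch (Python used for brevity). It only simulates "one document per identifier" with a dictionary and is not how dtSearch works internally; the rows are made-up data shaped like the stored procedure output.

```python
from collections import Counter


def index_rows(rows, id_field):
    """Simulate indexing keyed by document id: a later row replaces an earlier one."""
    index = {}
    for row in rows:
        index[row[id_field]] = row
    return index


def duplicate_ids(rows, id_field):
    """Return the identifiers that occur in more than one row."""
    counts = Counter(row[id_field] for row in rows)
    return sorted(doc_id for doc_id, n in counts.items() if n > 1)


# Made-up rows: DisputeId 7 appears twice, as in the stored procedure output.
rows = [
    {"DisputeId": 7, "ContentTypeDataMasterId": 58},
    {"DisputeId": 7, "ContentTypeDataMasterId": 59},
    {"DisputeId": 30, "ContentTypeDataMasterId": 60},
]
by_dispute = index_rows(rows, "DisputeId")               # 2 documents, one row lost
by_master = index_rows(rows, "ContentTypeDataMasterId")  # 3 documents, nothing lost
```

Running a check like duplicate_ids on the first view before indexing is a cheap way to confirm that the chosen identifier column really is unique.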

Let me know if you have any questions.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We do not understand what you are suggesting. As discussed, we need to pass the DisputeId column value from the stored procedure data to the 2nd SQL view to get all the Dispute and Document related data that we need to display on the subscriber side.

If you want a unique column, you can use the ContentTypeDataMasterId column.

If you want to have a call, we are available now until 6:00 PM IST. We are also available on Monday from 10:30 AM to 5:00 PM IST.

Please let us know.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh ,

The problem is not indexing of the second view. The problem is with the first view (stored proc). Each row in the table returned by the stored procedure should correspond to one indexed document (and that will be returned in results as one matching document). 

My understanding is that your result item for this search is a Dispute. That means that you should have only one row per dispute in the first view indexed. As you have the same DisputeId in multiple rows, I suspect this is not the case.

BTW, I can index what you provided so far and can continue my work. However, my concern is that the data is not prepared as it should be, which will result in an incomplete index and wrong or incomplete search output.

I hope this makes it more clear. Please let me know.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Yes, you will get rows with the same DisputeId. As discussed in the last call, we are grouping the 2nd SQL View data on the C# side. So when we get multiple rows with the same DisputeId, we will group them on the C# side and then present the data to the user.

Please let us know if you want to discuss over call.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh ,

If we use DisputeId for index document identifier, you cannot get multiple rows - you would get only match for the last row as it will overwrite previous rows with the same id.
So, is using ContentTypeDataMasterId for the document ID fine?
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

I think we need to take one short call to sort out this. Please let us know when you are available to discuss over call.
Harsh Parikh, Tech Lead at DevIT
Thanks Radomir Mladenovic, Contegra Radomir   for quick call. 

As just discussed, you can take the ContentTypeDataMasterId column as the unique column, use it to get the value of the DisputeId column, and pass that DisputeId value to the 2nd SQL View.

Hope this is fine.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh ,

I'm still working on the search service changes - you should have it tomorrow. In the meantime, here's the JSON structure that you can use to send your query:

1) Boolean queries nesting:

{
	"type": "boolean",
	"operator": "and",
	"clauses": [
		{
			"type": "boolean",
			"operator": "or",
			"clauses": [
				...
				another boolean or match type
			]
		}

	]
}

2) Field value matching:

{
	"type": "match",
	"exclude": false,
	"field": "Field_100",
	"value": "3"
}
or with multiple values:
{
	"type": "match",
	"exclude": false,
	"field": "Field_100",
	"values": ["3", "7", "14"],
	"operator": "and"
}
Obviously, the "exclude" field is if you want to exclude documents having field values. The "operator" tells how to combine multiple values - are they combined using OR or AND.
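
As a sketch, the two object types above can be produced by a pair of small helpers (Python used for brevity). The helper names are made up for illustration; the keys and values follow the structure described above.

```python
import json


def match(field, *values, exclude=False, operator="and"):
    """Build a "match" clause; one value uses "value", several use "values"."""
    if not values:
        raise ValueError("at least one value is required")
    clause = {"type": "match", "exclude": exclude, "field": field}
    if len(values) == 1:
        clause["value"] = values[0]
    else:
        clause["values"] = list(values)
        clause["operator"] = operator  # how the multiple values are combined
    return clause


def boolean(operator, *clauses):
    """Combine match or boolean clauses with "and" or "or"."""
    return {"type": "boolean", "operator": operator, "clauses": list(clauses)}


# Field_100 = 3 AND (Field_100 is 7 or 14 OR Field_101 is not 5)
query = boolean(
    "and",
    match("Field_100", "3"),
    boolean(
        "or",
        match("Field_100", "7", "14", operator="or"),
        match("Field_101", "5", exclude=True),
    ),
)
print(json.dumps(query, indent=2))
```

Generating the structure from helpers like these avoids hand-written JSON mistakes such as missing commas or mismatched nesting.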

I hope this covers all your cases. Let me know if anything is missing.
Harsh Parikh, Tech Lead at DevIT
OK, thanks Radomir Mladenovic, Contegra Radomir . Can we have a call on Wednesday between 10:30 AM and 5:00 PM IST to discuss the JSON structure? In the meantime, we will discuss it internally.

Please confirm a convenient time.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh we can talk Wednesday 11:00 AM IST.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh

Here's an update for you: https://www.dropbox.com/sh/iufr961llu2wrht/AADzxZAmXEcDn-hotEyY3CkNa?dl=0

1) DBIndexer is updated to index the second view. Check the indexer-config-disputes.json config file in the zip file; you probably just need to change the index path to where you want to save the index.

2) TologixWebSearch: copy the CS files to update the existing project files. In appsettings.json you can see that a new parameter was added ("DisputesIndex") that should point to the Disputes index.

The search service endpoint for the Disputes search is: /api/search/disputes

I'm attaching an example request body and response.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , did you have time to try Disputes indexing and the search service?
I've noticed that, because of multiple rows in the 2nd view, some fields in results have multiple values. I made changes to the indexer (attached modified file) to prevent that. I'm also sending you a sample output for the same query as in the previous post.

Talk to you tomorrow.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ​,

We will look into this today and will update you.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

As discussed on today's call, we need only the 2nd View indexing results to bind the data to our model.

We use the stored procedure only to get the DisputeId from the filtered data and pass that DisputeId value to the 2nd View. So we need the 2nd View data, which contains all the columns we need to bind to our model.

Hope this is fine. Let us know once you have changed the service.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh  

I made changes in accordance with today's talk. You can find modified files at
https://www.dropbox.com/sh/htv0bhk4sea6n0p/AAB7HzodCDz1N6aKf0W-1NGga?dl=0

Note that the dispute index config has changed as well.

A new parameter, DisputeDocsIndex, was added to appsettings.json.

You can see sample results output JSON in disputes-response-sep17.txt

Hope that's what you need. Let me know if you have any questions.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We have the following operators, which the service needs to support when we pass filter data in the JSON file:

  • is set
  • is not set
  • is equal to
  • is not equal to
  • is after
  • is before
  • is between
  • is not between
  • starts before
  • starts after
  • ends before
  • ends after
  • lasts more than
  • lasts less than
  • is greater than
  • is less than
Let us know once you have updated the service. Also, please provide an example JSON file that contains all of the above operators.
Radomir Mladenovic, Contegra
Hi @Harsh,

All these operators do not exist in dtSearch as is. You need to transform them to a combination of "equals" and boolean statements.

For example, for "is set" and "is not set", you could add an index field (column to your view) as "field_59_set" with value "Y" (when the value is set) and "N" (when the value is not set). Then, if you need "field_59 is not set", send search for "field_59_set = N".

As for the after/before/between operators, I guess you need them only for dates, correct? First, you need to change the formatting of dates in your view to the "YYYYMMDD" format. A dtSearch date range query filter looks like this: xfilter(word "datefield::20020101~~20020131")
https://support.dtsearch.com/webhelp/dtsearchCppApi/File_Conditions.html
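
As a sketch, formatting dates as YYYYMMDD and composing the filter string shown above could look like this (Python used for brevity). The helper names are made up, and the code only builds the string; how it is then passed to the search service is not covered here.

```python
from datetime import date


def to_index_date(d):
    """Format a date as YYYYMMDD, the form the indexed date fields should use."""
    return d.strftime("%Y%m%d")


def date_range_filter(field, start, end):
    """Build the xfilter expression for dates between start and end."""
    return 'xfilter(word "{}::{}~~{}")'.format(
        field, to_index_date(start), to_index_date(end)
    )


expr = date_range_filter("datefield", date(2002, 1, 1), date(2002, 1, 31))
```

Zero-padded YYYYMMDD strings sort the same way as the dates they represent, which is what makes a range over them meaningful.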

Filters containing starts/ends/lasts probably refer to a specific field. You will need to specify that field in the query, combining with supported operators.

Operators "greater than" and "less than" do not exist as such. What are you comparing? What's the type of the field this applies to?
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

I think we need to discuss the above points on a call. We also have the following 2 questions about the JSON request that we pass to filter Dispute & Document data:

1) If we do not pass any operator in the JSON, the API returns an invalid operator error.
2) We need you to confirm the JSON file for Add Another Rule.

We are available to discuss all of the above between 10:30 AM and 6:00 PM IST.

Please let us know a convenient time to discuss.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

As you need all the details, we have attached a ZIP file that contains 4 screenshots showing the different kinds of filter data we set.

Based on this filter data, we made a JSON file, which you can find in the ZIP folder.

Please look at that JSON file and let us know whether it is correct for the filter data shown.

As you mentioned, we will talk more on Thursday's call at 11:00 AM IST.
Radomir Mladenovic, Contegra
Hi Harsh,

1)

{
"type":"match",
"field":"Field_62",
"Operator":"and", <<<<< this means both 234 AND 233 should be present, if that's what you wanted, it's fine
"values":["234","233"]
},

2)

{
"type":"match",
"exclude":false,
"field":"Field_347",
"values":[true], <<<< as you have a single field here, you can use the "value" field and then you don't need the Operator here. It would be ignored anyway. (Check the support JSON examples I sent you earlier.)
"Operator":"and"
},

3) The query part with Field_91 and Field_94 doesn't look good. From other items in your JSON I think you didn't fully understand the logic of the structure.

For type:boolean, the operator is used to combine clauses. For example:

{
"type": "boolean",
"operator": "and",
"clauses": [ c1, c2, c3 ]
}

will be translated to c1 and c2 and c3 where c1/c2/c3 are boolean or match elements.

According to my understanding of your UI and what you told me, the JSON for this part would look like:

{
"type":"boolean",
"Operator":"or", <<< you combine with OR the same queries for Field_91 and Field_94
"clauses":[
{ <<< a clause for Field_91
"type":"boolean",
"Operator":"or", <<< this is OR selected in your UI
"clauses":[
{
"type":"match",
"exclude":false,
"field":"Field_91",
"values":["23447","23573"],
"Operator":"and"
},
{
"type":"match",
"exclude":true,
"field":"Field_91",
"value":"23411"
}
]
},
{ <<< a clause for Field_94
"type":"boolean",
"Operator":"or", <<< this is OR selected in your UI
"clauses":[
{
"type":"match",
"exclude":false,
"field":"Field_94",
"values":["23447","23573"],
"Operator":"and"
},
{
"type":"match",
"exclude":true,
"field":"Field_94",
"value":"23411"
}
]
}
]
}
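
To make the nesting easier to check by eye, here is a minimal Python sketch that renders a filter structure like the one above as a readable boolean expression. The `render` function and the `field:value` notation are hypothetical and for illustration only; they are not dtSearch syntax or part of the Contegra service. Rendering "exclude" as NOT is an assumption about how the service treats that flag.

```python
def render(node):
    """Render a boolean/match filter node as a human-readable expression."""
    # The thread uses both "operator" and "Operator", so keys are lower-cased first.
    n = {key.lower(): value for key, value in node.items()}
    if n["type"] == "boolean":
        joiner = f" {n['operator'].upper()} "
        return "(" + joiner.join(render(clause) for clause in n["clauses"]) + ")"
    if n["type"] == "match":
        if "values" in n:
            joiner = f" {n['operator'].upper()} "
            text = "(" + joiner.join(f"{n['field']}:{v}" for v in n["values"]) + ")"
        else:
            text = f"{n['field']}:{n['value']}"
        # Assumption: "exclude": true negates the match.
        return f"NOT {text}" if n.get("exclude") else text
    raise ValueError(f"unsupported node type: {n['type']!r}")


# The Field_91 clause from the structure above.
field_91 = {
    "type": "boolean",
    "Operator": "or",
    "clauses": [
        {"type": "match", "exclude": False, "field": "Field_91",
         "values": ["23447", "23573"], "Operator": "and"},
        {"type": "match", "exclude": True, "field": "Field_91", "value": "23411"},
    ],
}
print(render(field_91))
```

Running the same function over the full structure shows whether the outer OR really combines the two per-field clauses as intended.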

Hope this helps.
Radomir Mladenovic, Contegra
Hi Harsh,

I made changes in accordance with today's call.
You can find the updated files at: https://www.dropbox.com/sh/wvdyic79bqey4i4/AADZen55gjB-nhPwqDRG30MVa?dl=0

1) The indexer was updated to index date fields in YYYYMMDD format. Other date fields in the view that you send as text need to be fixed on your end.

2) Search service changes:

a) Instead of using QueryStatement, send JSON via field FilterStatement. The JSON structure remains the same, with additions for date search below.

b) To search for a date range, use an object of the "range" type:

{
"type": "range",
"field": "datecreated",
"from": "20190919",
"to": "20210919"
}

For the BEFORE operation, also use type "range" but specify only "to" (omit the "from" field). For the AFTER operation, use "range" and "from".
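
As a rough illustration of the three cases (range, BEFORE, AFTER), here is a small Python sketch that builds the "range" object from dates. The `date_range` helper and its `after`/`before` parameter names are hypothetical. The thread does not say whether the bounds are inclusive, so that is left open.

```python
from datetime import date


def date_range(field, after=None, before=None):
    """Build a "range" clause; omit "from" for BEFORE and "to" for AFTER."""
    if after is None and before is None:
        raise ValueError("at least one of 'after' or 'before' is required")
    clause = {"type": "range", "field": field}
    if after is not None:
        clause["from"] = after.strftime("%Y%m%d")  # YYYYMMDD, as indexed
    if before is not None:
        clause["to"] = before.strftime("%Y%m%d")
    return clause


between = date_range("datecreated", after=date(2019, 9, 19), before=date(2021, 9, 19))
before_only = date_range("datemodified", before=date(2019, 1, 10))
after_only = date_range("datecreated", after=date(2019, 9, 19))
```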

c) The operator field is now optional and the default is OR as you requested.

d) To sort results, use SortField with the name of the field to sort on, and SortOrder, which can have the value "asc" or "desc".
Note that sorting is applied to the "disputes" index search; the results from "dispute-docs" are then returned for the already ordered disputes. I hope that makes sense.

Below is an example of a JSON structure with the new features:

{
"searchRequest": "test",
"FilterStatement": {
"type": "boolean",
"operator": "or",
"clauses": [
{
"type": "match",
"field": "datecreated",
"value": "20201001"
},
{
"type": "range",
"field": "datecreated",
"from": "20190919",
"to": "20210919"
},
{
"type": "range",
"field": "datemodified",
"to": "20190110"
}
]
},
"SortField": "language",
"SortOrder": "desc"
}
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

Can we have a call to discuss the Full Text Search module requirements and how we get data from the Contegra search?

We propose 6th October at 11:00 AM IST.

Please confirm.
Radomir Mladenovic, Contegra
Hi Harsh,

Yes, we can talk on 6th October at 11:00 AM IST.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir and Rob,

We are working on the Full Text Search (FTS) module. Due to a technical challenge, we are delayed on this module.

We are ready with the SQL view and the requirements for the FTS module.

Could we have a call next Tuesday (20th October) at 11:00 AM IST to discuss and finalize the FTS module?

We will have the call in our Skype group.

Please confirm.
Radomir Mladenovic, Contegra
Hi Harsh, Tuesday (20th October) 11:00 AM IST works for me.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

As discussed today, here are the details for the FTS module.

1) We are providing a query and a view. The query finds the ContentTypeDataMasterId values that match the filters we pass. We then pass those ContentTypeDataMasterId values to our view to display the card details. The view is our final result.

2) The SQL view contains the HtmlFileName, PDFFileName and ISPDFOnly columns.
ISPDFOnly : 0 -> search for the keyword in the HTML files
ISPDFOnly : 1 -> search for the keyword in the PDF files

Server path for PDF and HTML files: E:\ISLGRebuildDemo\wwwroot\Documents (server IP: 10.68.138.10)
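
The ISPDFOnly rule in point 2 can be sketched in Python as follows. The `file_to_index` function is hypothetical, and it assumes the file names in the view are relative to the documents root given above, which the thread does not confirm.

```python
from pathlib import PureWindowsPath

# Documents root as given in the message above.
DOCUMENTS_ROOT = PureWindowsPath(r"E:\ISLGRebuildDemo\wwwroot\Documents")


def file_to_index(row, root=DOCUMENTS_ROOT):
    """Pick the file to search for one view row, based on ISPDFOnly."""
    # ISPDFOnly = 1 -> PDF file; ISPDFOnly = 0 -> HTML file.
    if int(row["ISPDFOnly"]) == 1:
        name = row["PDFFileName"]
    else:
        name = row["HtmlFileName"]
    # Assumption: names in the view are relative to the documents root.
    return root / name


row = {"ISPDFOnly": 0, "HtmlFileName": "sample.html", "PDFFileName": "sample.pdf"}
print(file_to_index(row))
```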

3) We need to highlight the keyword in the paragraph or page text. We also need the hit count.

4) We have sorting and pagination on the displayed cards. By default we need to display 10 cards.

I have attached a sample PDF file, a sample HTML file, the query and the SQL view.



Database Name : ISLGRebuild (Server : 10.68.138.11)
SQL Query Name :  FE_MetafieldwithValueDynamicFTS
SQL View Name : VW_FTSDocumentSearch


Also, as discussed, please take the following SearchControl.cs file, put it in your project and use it for further work.

Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

As discussed today, please let us know once you have resolved the index regeneration issue.

Also, please give us an update on the Full Text Search module. When can we expect it?
Radomir Mladenovic, Contegra
Harsh, I hope to have Full Text Search ready early next week.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

What should we do with the files you provided at the following link for the re-indexing issue?
https://1drv.ms/u/s!AugzRBG6eTFwg4sJwygyvkWxOQnMWg?e=gfh2JQ

We do not understand what to do with those files. How can we resolve this?
Radomir Mladenovic, Contegra
Hi Harsh, not sure what you mean. In the folder "2020-10-29 DB Indexer update" you have the compiled indexer and the modified sources.
The indexer now checks whether the index already exists and, if it does, turns off the "Create" flag. That flag was the problem that prevented indexing when the index was in use. I reproduced the problem locally and it was fixed by this change.
Hope it works for you.
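
The idea behind the fix can be sketched in Python as follows. This is not the indexer's actual code: the `should_create` function is hypothetical, and treating "the index exists" as "the index directory is non-empty" is an assumption made only for this illustration.

```python
from pathlib import Path
import tempfile


def should_create(index_dir):
    """Return True only when there is no existing index to update."""
    path = Path(index_dir)
    # Assumption: a non-empty directory means an index is already there.
    index_exists = path.is_dir() and any(path.iterdir())
    return not index_exists


with tempfile.TemporaryDirectory() as tmp:
    first_run = should_create(tmp)                # empty folder: create from scratch
    (Path(tmp) / "placeholder.bin").write_bytes(b"x")
    later_run = should_create(tmp)                # existing index: update in place
print(first_run, later_run)
```

With the flag off, an existing index is updated rather than recreated, which is what matters when the web application already has the index open.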
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

What do we need to do with the 3 files in the folder? Can we replace the corresponding files in the TologixDBIndexer project?

If yes: when we replace those files, we get an error that DtHelpers does not exist.
Radomir Mladenovic, Contegra
Hi Harsh, maybe the files weren't copied properly. There's a DtHelpers class in SampleDataSource.cs, so I'm not sure why you would see the error.
In any case, you can use the already compiled exe that I provided.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

As per your comment, we also copied the SampleDataSource.cs file into the TologixDBIndexer project. The error is gone and the index is generated.

However, the same problem still occurs. We need to stop the published application in IIS and then generate the indexes again; only then are results returned. If we generate the index directly, no results are found.

Please look into this. We are available on Skype until 6:00 PM IST to discuss.
Radomir Mladenovic, Contegra
Hi Harsh, let's talk tomorrow at 11:30 IST about the indexing issue. I reproduced the issue you had and the new indexer worked fine for me after the fix.
Radomir Mladenovic, Contegra
Hi Harsh, I started indexing FE_MetafieldwithValueDynamicFTS and noticed one issue: the ContentTypeDataMasterId values are not unique in this query. Which column identifies the document/row?
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

We are available to discuss the FTS module and the re-indexing issue. Can we connect?
Radomir Mladenovic, Contegra
Hi Harsh, as discussed today:

1. I need you to add a unique ID to the indexed query/view. It can be a row number as you suggested, but in that case you will not be able to do incremental indexing (as there's no reliable unique identifier of a document). As you said the full index is generated daily, this might not be a problem, so a row number is fine.

2. If we generate two separate indexes for the query and the view, we might hit the dtSearch limit on request size (I recall it was 64KB) when passing result IDs from the first index to the next one. That's why I highly recommend either:
a) Adding all data to one stored procedure as you suggested, or
b) Adding a ContentTypeDataMasterId parameter to the FE_MetafieldwithValueDynamicFTS procedure, so that we can index the view first and get only the related data from the proc.
That will allow us to create a single index with all data.
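
For a sense of scale, here is a back-of-the-envelope Python sketch of how quickly a list of result IDs could reach 64KB. The limit is the figure recalled above and is not verified here. The `" or "` separator and the plain-text encoding of IDs are assumptions, since the thread does not show how IDs would be passed between the two indexes.

```python
LIMIT_BYTES = 64 * 1024  # figure recalled in the message above, not verified


def request_size(ids, separator=" or "):
    """Size in bytes of the IDs joined into one request string."""
    return len(separator.join(str(i) for i in ids).encode("utf-8"))


def fits_in_request(ids, limit=LIMIT_BYTES):
    return request_size(ids) <= limit


five_thousand = range(10000, 15000)   # 5,000 five-digit IDs
ten_thousand = range(10000, 20000)    # 10,000 five-digit IDs
print(request_size(five_thousand), fits_in_request(five_thousand))
print(request_size(ten_thousand), fits_in_request(ten_thousand))
```

Under these assumptions, a first-stage result set of roughly ten thousand IDs would already exceed the limit, which is why a single index avoids the problem.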

3. dtSearch doesn't support text extraction (and highlighting) at the page level. I'll think about how to meet your requirements for getting result pages and discuss this with Rob.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

Here are the points discussed on today's call.

1) The re-indexing issue is resolved.
2) As you recommended, we have merged the query and the view into a single query for all data, so the second view is no longer needed. The query name is given below.
3) We have added a row number to the query as the unique identifier (column name RowId).

The stored procedure name is FE_MetafieldwithValueDynamicFTS. You can find it in the ISLGRebuild database on our server.

There is now no need for the second view; we get all data from the query mentioned above.

For sorting, you can use the following 4 options from the query:

  • Relevance (default) - based on the current "hit count" functionality; depends on the hit count from your side
  • Document Name (A-Z) - column name FullCitationText
  • Newest First - column name SortingDate (we provide this column in yyyymmdd format)
  • Oldest First - column name SortingDate (we provide this column in yyyymmdd format)
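
The four options above could map to the SortField and SortOrder request fields roughly as in this Python sketch. The `sort_params` helper is hypothetical. It assumes that omitting the sort fields gives relevance order, and that the indexed field names match the column names exactly; neither is confirmed in the thread.

```python
# UI option -> (SortField, SortOrder); None means "send no sort fields".
SORT_OPTIONS = {
    "Relevance": None,  # assumption: the service ranks by hits when no sort is sent
    "Document Name (A-Z)": ("FullCitationText", "asc"),
    "Newest First": ("SortingDate", "desc"),  # yyyymmdd strings sort chronologically
    "Oldest First": ("SortingDate", "asc"),
}


def sort_params(option):
    """Return the request fields to add for a sort option chosen in the UI."""
    choice = SORT_OPTIONS[option]
    if choice is None:
        return {}
    field, order = choice
    return {"SortField": field, "SortOrder": order}


print(sort_params("Newest First"))
```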

As we discussed, when the ISPDFOnly column is 1 in the query, we need to display the page number from the PDF file; when the user clicks on it, we need to render that page below it and highlight the search word.

For HTML, we can display the paragraph from the HTML file and highlight the search keyword in that paragraph.

Also, we have paging in this module.

I hope this is fine and that you have everything you need from our side.
Radomir Mladenovic, Contegra
Hi Harsh,

At https://1drv.ms/u/s!AugzRBG6eTFwg4sJwygyvkWxOQnMWg?e=gfh2JQ
you can find a new folder with the FTS indexer and the web service update.

The indexer config file indexer-config-fts.json specifies paths to two different indexes:
- IndexDir (as before) is the main documents index
- ParagraphIndexDir is for the index containing paragraph-level data.
Both indexes are created in one run.

Both indexes should be added to the web service appsettings.json as in:
    "Tologix": {
      "FullTextDocsIndex": "y:\\contegra\\tologix\\test-index\\tologix-index-fts\\",
      "FullTextParasIndex": "y:\\contegra\\tologix\\test-index\\tologix-index-fts-paras\\",
      "SubjectNavigatorIndex": "y:\\contegra\\tologix\\test-index\\subject-nav\\",
      "DisputesIndex": "y:\\contegra\\tologix\\test-index\\tologix-index-disputes\\",
      "DisputeDocsIndex": "y:\\contegra\\tologix\\test-index\\tologix-index-dispute-docs\\"
    },

The web service endpoint /api/search/fts was added for full text search requests. It's similar to what you had before, with an additional paragraphs field containing a list of matching paragraphs.
See the file full-text-search-example.txt for an example request and response payload.

When paragraphs is null, it means the matches are not in the paragraph contents but in some other field.

Currently, for documents with ispdfonly=True the paragraph number is null. This is because we're not yet indexing separate pages in PDF documents, so we cannot provide this information. We're still looking for a solution.

The web service endpoint /api/search/highlight-para handles paragraph highlighting requests. You need to pass an object with paraId (the paragraph identifier from the search results), plus searchRequest and any applicable search control options. See the BasicSearchParams class in the sources for all available options.
See the file full-text-search-paragraph-highlighting-example.txt for an example request and response payload.
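
As a rough sketch of the highlighting call described above, the Python below builds (but does not send) a request to /api/search/highlight-para. The base URL, the use of POST with a JSON body, and the sample paraId value are assumptions; the example payload file mentioned above remains the authoritative reference.

```python
import json
from urllib.request import Request


def highlight_para_request(base_url, para_id, search_request, **options):
    """Build a request carrying paraId, searchRequest and optional search options."""
    payload = {"paraId": para_id, "searchRequest": search_request, **options}
    return Request(
        url=base_url.rstrip("/") + "/api/search/highlight-para",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: the endpoint accepts a POSTed JSON body
    )


# Hypothetical host and paragraph identifier, for illustration only.
req = highlight_para_request("http://localhost:5000", "example-para-id", "test")
print(req.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out here so the sketch runs without a server.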

Let me know if you have any questions.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

We are integrating the Contegra FTS search into our application and are having trouble getting data from the index. It throws an error, so we need to discuss this on a call.

We also need to understand the whole procedure for getting the data, as well as the paragraph list, from the index.

We are blocked on further development until this is resolved.

We are available until 6:30 PM IST on all weekdays. Please ping our Skype group to discuss this.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

Thanks for the quick call.

As discussed, we are currently getting an error when fetching result data from the API call. The details are below.

at System.ThrowHelper.ThrowAddingDuplicateWithKeyArgumentException[T](T key)
   at System.Collections.Generic.Dictionary`2.TryInsert(TKey key, TValue value, InsertionBehavior behavior)
   at System.Collections.Generic.Dictionary`2.Add(TKey key, TValue value)
   at TologixWebSearch.Controllers.SearchController.FullTextSearch(SearchModel sm) in D:\shrinivas\SVN_Project\TologicWebSearch\Controllers\SearchController.cs:line 65
   at Microsoft.AspNetCore.Mvc.Internal.ActionMethodExecutor.SyncActionResultExecutor.Execute(IActionResultTypeMapper mapper, ObjectMethodExecutor executor, Object controller, Object[] arguments)
   at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.<InvokeActionMethodAsync>d__12.MoveNext()


Also, we need to highlight the search word in any field in the Subject Navigator module.

Please let us know once you complete this.
Radomir Mladenovic, Contegra
Hi Harsh, you can find the fix for the FTS in the "2020-11-11 FTS update" folder on OneDrive.
I'll let you know when I have a solution for field highlighting.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

Unfortunately, my team member Shrinivas's father passed away last week, so he is only starting work on this this week.

We are getting an error while fetching paragraph text from the highlight-para method.

Can we have a call to look into this issue next Monday between 10:00 AM and 6:00 PM IST?
Radomir Mladenovic, Contegra
Hi Harsh,
Sorry to hear that. Yes, we can talk on Monday. I'll contact you on Skype as soon as I'm available in the morning (around 13:00 IST, that should be my 08:30).
Radomir Mladenovic, Contegra
Harsh, in the OneDrive folder you can find the web service update for field highlighting. The result object will contain a "highlightedFields" map with the highlighted content when a match was found.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

Thanks for today's call.

Does this API search for the word in PDF files and return the PDF page number, so that when we click on the page number it displays the whole page text with the search keyword highlighted?

Could you please confirm?
Radomir Mladenovic, Contegra
Hi Harsh, as discussed on Skype, the current FTS search returns matches for PDF documents but doesn't offer page hits. That's because dtSearch doesn't allow us to get the PDF text for a particular page. We need to extend the custom indexer to support this, and as part of that we need a solution for splitting PDFs into pages.
My understanding is that Tologix has a PDF Highlighter license, but for an older version. I'm waiting on feedback from Rob and Morgan about this. If Tologix upgrades to the latest Highlighter Pro Edition, we can use it for text extraction at the page level.
If the upgrade doesn't happen, we'll need to research further and find some other solution for it.
Harsh Parikh, Tech Lead at DevIT
OK, thanks Radomir for the confirmation. For now we are going ahead with HTML files; let us know once you have any confirmation or solution for PDF files.
Morgan Maguire, CEO
Hi Harsh, Radomir and Rob,

I spoke to Rob about this issue a few weeks ago, and told him to implement whatever solution meets our requirements for the PDF search results, including an upgraded PDF Highlighter license if required. So please implement your recommended solution as soon as possible, because this feature release is now several weeks past due.

Thanks,

Morgan
Radomir Mladenovic, Contegra
Hi Morgan, sounds good. I'll start with the PDF Highlighter integration and hope to have an updated indexer in a day or two.

I have one technical question, and I'm not sure who can answer it: can I rely on the Highlighter server having access to the network share with the documents being indexed? I can upload each PDF to Highlighter whenever a file is indexed, but if Highlighter has access to the file share we can save time and bandwidth.

Thanks,
Radomir
Morgan Maguire, CEO
Hi Radomir,

I have no problem setting things up that way, as long as access to the network share is done securely and does not compromise the security of any data on our servers.

Harsh or Jitesh, can you answer the question from a technical feasibility perspective, or do we need to talk to Carbon60 (our server hosting provider) about this?

Also, Rob, I assume we'll need to set up a new MSA to accommodate this setup?

Thanks,

Morgan
 
Harsh Parikh, Tech Lead at DevIT
Hi Morgan and Radomir,

As Morgan suggested, if the network share is accessed securely, I also don't see any problem with setting it up that way.
Morgan Maguire, CEO
Ok. Great. Thanks, Harsh.

Radomir, please proceed, unless, Rob, you think we should have a quick call to discuss?

Morgan
Rob Wiesenberg, Contegra
Morgan, the approach seems fine. Do you expect to continue to use the ISLG legacy product once the new version launches? The reason I ask is that we will need to determine whether more than one PDF Highlighter license is needed. Radomir, can we serve both applications from the server where Highlighter will be running?
Morgan Maguire, CEO
Hi Rob,

Yes, we'll be maintaining the legacy application for a beta period (3-4 months). During that period, both applications will be operating in parallel.

Thanks,

Morgan 
Radomir Mladenovic, Contegra
One Highlighter instance can serve both applications. We'll just need to upgrade the installation.
I can install a trial version on the dev server for use with the current test content folders, and we can upgrade the production instance later.
Radomir Mladenovic, Contegra
Morgan, Harsh, I have the PDF page indexer ready but need to install or update PDF Highlighter. It looks like the server 10.68.138.10 is running the production Highlighter, correct?
I don't have access to another server (except the SQL Server) on which I could install Highlighter for development/testing. Do you want me to upgrade the production instance instead? There will be some downtime though (15-130 minutes until I migrate the config). Please let me know.
Radomir Mladenovic, Contegra
I made a typo... 15-30min
Morgan Maguire, CEO
Hi Radomir,

Harsh or Jitesh should probably confirm, but yes, the application runs on server 10.68.138.10; however, this server is used for both the production (https://www.investorstatelawguide.com/) and development (https://dev.investorstatelawguide.com/) environments. Does that mean we're using the same instance of PDF Highlighter for both environments?

Note that if it's going to require downtime on the server that will affect the production environment, we should perform the install between 2:00am and 3:00am Eastern Time (North America) on a Friday or Saturday evening to minimize disruption to users.

Thanks,

Morgan
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

All ISLG applications are running on server 10.68.138.10. The development environment is https://dev.investorstatelawguide.com/ and the production environment is https://www.investorstatelawguide.com/.

You can perform the Highlighter upgrade between 2 and 3 AM Eastern Time (North America) on Friday (that is, tomorrow, 27th November).

Please make sure it does not affect the production application (https://www.investorstatelawguide.com/).

Also, let us know once you have completed the upgrade and the server has been restarted, so we can check all applications running on server 10.68.138.10.
Morgan Maguire, CEO
Hi Radomir,

I discussed the issue above with Harsh earlier today. Assuming you agree that the risk to the production environment FTS is low (i.e., that updating the PDF Hit Highlighter software will not affect the FTS on the subscriber side of https://www.investorstatelawguide.com/), please proceed with making the updates on server 10.68.138.10 during the window on Friday or Saturday this weekend.

Please confirm when the updates will occur, and Harsh and Ketan will ensure that someone is available to test the application after the updates are made to confirm there are no issues.

Thanks,

Morgan
Radomir Mladenovic, Contegra
Hi Morgan, Harsh,

I can do the upgrade during my Saturday morning (which should be around 2am your time). I'll send a message to Harsh and Ketan so they can make sure production is working as expected.

FYI, I'll be on the road from Saturday afternoon to Sunday evening, so I will not be available during that time. If you think it's too risky to upgrade before my trip, we can leave it for my Monday morning (which should be around 2am in the US; that still gives Harsh and the team time to verify the installation and leaves enough time for any corrections before US work hours).

Please let me know what you prefer.

Thanks,
Radomir
Morgan Maguire, CEO
Ok. Sounds good, Radomir. Let's proceed with the upgrade on Saturday at 2am. I will send out a calendar invite to remind everyone.

Harsh and Ketan, please take note that someone will need to be available to check that the application (including the FTS on https://www.investorstatelawguide.com/) is functioning as required on Saturday between 12:30pm and 1:30pm Ahmedabad time.

Thanks,

Morgan
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

Please update us once you complete the process on the server so we can check all applications.
Radomir Mladenovic, Contegra
Hi Harsh, PDF Highlighter has been upgraded. Please test and let me know if you notice any issues.
Harsh Parikh, Tech Lead at DevIT
OK, Radomir. I will check and provide feedback.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir and Morgan,

I have checked all applications, as well as the FTS module, on both investorstatelawguide.com and dev.investorstatelawguide.com, and all are working fine. The highlighter is also working in the FTS module.
Radomir Mladenovic, Contegra
Hi Harsh, I'm trying to test the new FTS indexer, but it looks like there's currently no content in the test database where IsPdfOnly=True. Could you please add some sample PDF content?
Morgan Maguire, CEO
Great. Thank you, Harsh and Radomir. I've tested the FTS as well, and everything appears to be working normally.

Yes, Radomir. There are no documents on rebuild.islg; however, there should be some sample content on rebuilddemo.islg.

Thanks,

Morgan
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

We have uploaded test documents with ISPDFOnly=True to our ISLGRebuild database on server 10.68.138.11.

Also, the PDF files are stored on the E: drive of server 10.68.138.10 at the following path:

E:\ISLGRebuildDemo\wwwroot\Documents\PDFFiles
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

Currently, 2 things are pending in the FTS module from your side.

1) When we click on a paragraph from an HTML file, we display the HTML paragraph text below it, but we also need to highlight the search keyword inside that paragraph, which is not currently working.

2) We need to display the page number when ISPDFOnly = True, and when the user clicks on the page number we need to display the whole page text and highlight the search keyword.

Can we have a call tomorrow on Skype between 10:00 AM and 6:00 PM IST?

Please confirm.
Radomir Mladenovic, Contegra
Hi Harsh,

Paragraph extraction was implemented for the HTML file format you sent me, and highlighting for that worked fine on my end. This HTML had:
  • <div> elements with class "paraBullet" containing paragraph number, and
  • elements with class "ParaText" containing paragraph text.
However, you've added HTML files in a different format. Please check file "OTI-0004 - Vienna Convention on the Law Treaties-1.html" for example. The above mentioned classes are used in a different way and paragraphs cannot be extracted using the same logic. 

The indexer was breaking because it could not parse HTML files in an unsupported format. I believe that's the reason why paragraph highlighting didn't work for you. I made a change to the indexer so that an invalid file is skipped and the error is logged. Now, at least valid HTML files will be handled.
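
The extraction and skip-on-error behaviour described above can be sketched in Python as follows. This is not the indexer's code: `ParagraphExtractor`, `extract_paragraphs` and `index_all` are hypothetical names, and the sketch assumes each "paraBullet" element is directly followed by its "ParaText" element.

```python
import logging
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}  # tags without an end tag


class ParagraphExtractor(HTMLParser):
    """Collect (number, text) pairs from paraBullet / ParaText elements."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._mode = None      # "number", "text", or None when not capturing
        self._depth = 0        # nesting depth inside the captured element
        self._buffer = []
        self._number = None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._mode is not None:
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if "paraBullet" in classes:
            self._mode = "number"
        elif "ParaText" in classes:
            self._mode = "text"
        else:
            return
        self._depth, self._buffer = 1, []

    def handle_endtag(self, tag):
        if self._mode is None or tag in VOID_TAGS:
            return
        self._depth -= 1
        if self._depth == 0:
            content = " ".join("".join(self._buffer).split())
            if self._mode == "number":
                self._number = content
            else:
                self.paragraphs.append((self._number, content))
                self._number = None
            self._mode = None

    def handle_data(self, data):
        if self._mode is not None:
            self._buffer.append(data)


def extract_paragraphs(html):
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    if not parser.paragraphs or any(num is None for num, _ in parser.paragraphs):
        raise ValueError("unsupported paragraph markup")
    return parser.paragraphs


def index_all(files):
    """Extract paragraphs per file; skip and log files in an unsupported format."""
    indexed = {}
    for name, html in files.items():
        try:
            indexed[name] = extract_paragraphs(html)
        except ValueError as err:
            logging.error("skipping %s: %s", name, err)
    return indexed


good = '<div class="paraBullet">12.</div><p class="ParaText">The <b>tribunal</b>\n finds.</p>'
result = index_all({"ok.html": good, "other-format.html": "<p>plain text</p>"})
```

A file that uses the same class names in a different arrangement would fail the check in `extract_paragraphs` and be logged rather than stopping the whole run.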
I also managed to test PDF page extraction and highlighting so that's working as well.

In order to extract paragraphs from HTML we need consistency in the HTML formatting. Please, let me know which format you're going to use. If there are multiple different formats, we need to support them all. It would be great if you could provide us with all format details. If you don't have this information, I'm afraid we'll have to do it one by one, analyzing error logs and the actual content.

In the OneDrive folder you can find the updated indexer and FTS indexing configuration. Note the new config parameter PdfHighlighterUrl, which is required for PDF page extraction.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

We are looking into the above. Meanwhile, we should have a call tomorrow between 1 PM and 6 PM IST so that we stay on the same page.

We also need to discuss the HTML format.

Please confirm for tomorrow's call.
Radomir Mladenovic, Contegra
Hi Harsh, let's talk tomorrow (Wednesday) at 1:30 PM IST.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

After today's call, we started checking all the functionality we developed for FTS:

1) Data binding
2) Paragraph listing
3) Sorting
4) Paging of document cards

However, paging of document cards is not working. When we click on the 2nd page, the same data as the 1st page is rendered.

Have we missed something for paging? Can we have another call tomorrow at 1:30 PM IST to resolve this issue?

Please confirm.
Radomir Mladenovic, Contegra
Hi Harsh, I'm sending you a fix for the pagination. Note that pagination is now enabled for Disputes and Subject Navigation as well. If this is not desired for these collections, you can change the last parameter in the call to CreateResponse to include all results. In that case, let me know so I can update my copy as well.
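
For reference, page-based slicing of an ordered result list can be sketched in Python as below. This is not the CreateResponse implementation; `page_of` is hypothetical, page numbers are assumed to start at 1, and the default of 10 matches the card count requested earlier in the thread.

```python
def page_of(results, page, page_size=10):
    """Return the slice of results shown on a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    start = (page - 1) * page_size
    return results[start:start + page_size]


cards = list(range(25))      # stand-in for 25 ordered document cards
second = page_of(cards, 2)   # cards 10..19, not a repeat of page 1
last = page_of(cards, 3)     # the 5 remaining cards
```

The symptom reported above (page 2 repeating page 1) is what results when the start offset does not move with the page number.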
Harsh Parikh, Tech Lead at DevIT
OK Radomir, thanks. We will implement this and let you know if we find any issue or error.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

How can we set the Rebuild project document path in the Highlighter folder? Currently the existing live path is already set in application.conf.

How can we also set the Rebuild project document path on server 10.68.138.10, and which Highlighter URL should we use?

Please answer as early as possible, as we have a release for UAT tomorrow.
Radomir Mladenovic, Contegra
Hi Harsh,

First, do you want to use the same Highlighter instance or not?
One option is to install another Highlighter instance on a different path and a different port.
If using the same instance, you could use a different path prefix for the Rebuild project and, in the folder mapping settings, point Highlighter to a different folder for that prefix.

If you're using the same instance, you need to use the same Highlighter URL. Otherwise, you need to set up another proxy on IIS to point to the other instance's port.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

Can we have a short call tomorrow or Monday between 10 AM and 6 PM IST? We are not very familiar with the Highlighter setup on the server and we don't want to take any risk with the current live ISLG application.

Please confirm.
Radomir Mladenovic, Contegra
Hi Harsh, let's talk tomorrow at 1:30 PM IST.
Harsh Parikh, Tech Lead at DevIT
OK, thanks Radomir. We will have the call tomorrow (13th December) at 1:30 PM IST on Skype.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

Hope you are doing well.

We have started looking into the FTS module bugs, and for 2 of the bugs we ran the following scenarios.

Scenario 1:

  • Enter only the keyword test
  • Click on the Search button
  • The API returns results whose language is displayed as English

Scenario 2:

  • Enter only the keyword test
  • Apply the Language filter English
  • Click on the Search button
  • The API doesn't return any data

We are available from 10:00 AM to 6:00 PM IST next week.

Please confirm.
Radomir Mladenovic, Contegra
Hi Harsh, the indexer logs debug details with all data fields indexed; debug was enabled by default, so you should have this in the log. Please make sure the value that was indexed for the record is the one you use in the filter.
If you cannot find the error, please send me the log, the index and the search payload you're using and I'll try to reproduce it.
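
The check suggested above (compare the indexed value with the value used in the filter) can be sketched in Python like this. The `filter_can_match` helper, the "language" field name in this context and the sample indexed values are all hypothetical; the real values have to be read from the indexer's debug log.

```python
def filter_can_match(clause, indexed_fields):
    """True if a single-value match clause equals the value logged for that field."""
    # Field names are compared case-insensitively; values are compared as text.
    indexed = {name.lower(): str(value) for name, value in indexed_fields.items()}
    return indexed.get(clause["field"].lower()) == str(clause["value"])


clause = {"type": "match", "field": "language", "value": "English"}

# Hypothetical log entries for one record, for illustration only.
print(filter_can_match(clause, {"Language": "English"}))  # same text was indexed
print(filter_can_match(clause, {"Language": "1"}))        # an ID was indexed instead
```

If the index stores something other than the display text (for example an ID), a filter on the display text returns nothing even though the unfiltered search shows English documents.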
Ketan Sondarva, Technical Project Manager at DevIT
Hi Radomir,

I would suggest a short call so we can show you the errors raised by the Industrial team; after that, further communication can continue via this channel.
Please let me know your availability for tomorrow. Harsh has already emailed you asking to meet so that we can go through the issues and get a solution as soon as possible.

Thanks,
Ketan Sondarva
Radomir Mladenovic, Contegra
Hi Ketan, OK, let's have a call at 5 PM IST today.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

We have another call at 6:00 PM IST. Can we have the call at 4:00 PM IST today instead? That is 15 minutes from now.

Or we can have the call tomorrow between 11:00 AM and 4:00 PM.

Please confirm.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir,

As discussed on today's call, here are the database name and the documents path on the server so you can test.

Server IP : 10.68.138.11
Database Name : ISLGRebuildFTS


Server IP : 10.68.138.10
Documents Path : E:\ISLGRebuildDemo\wwwroot\DocumentsFTS

At the above path you will find all the HTML and PDF documents.


I will also post all the bugs raised by the Industrial team.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Below are the descriptions of the bugs and issues that the Industrial team raised.

1 ) issue no : 20579 - Search results with no pinpoint references

Steps to reproduce:
  • Go to full text search
  • type 'interpretation' in the search input field
  • submit your search


Result: I am taken to a search results screen with 5 document results. Only 1 document result contains a pinpoint reference.

Expected: For each document result there must be at least one pinpoint reference for the keyword appearance in the text.


2) issue no : 20572 - FTS > Cannot perform search with filters

Steps to reproduce:
  • Go to Full Text Search
  • Make a selection in the 'Language' field (ie: French)
  • Scroll down and select [Submit search]

Result: I am taken to the search results page. No results or message is presented.

Expected: I expect to see the documents in my selected language as the result of my search. The documents are displayed if I do not apply a filter.


3) issue no : 20586 - Search with basic filter does not work

Steps to reproduce:
  • go to full text search
  • Type the term 'test' in the search input
  • Select 'English' from the language field
  • submit the search


Result: I am taken to the search results page. Some of the search results are not documents in 'English'.

Expected: Based on my applied filters, I must see only documents that contain the word 'test' and have language = English.


4) issue no. 20611 - Search > Not able to search without inputting a keyword

Steps to reproduce:
  • Go to FTS
  • Add any filtering parameters but not a keyword and press "search"


Result: Leads to an empty page with no results

Expected: Will see any results that correspond to the filtering parameters



5) issue no. 20612 : Search > Boolean search not working

Steps to reproduce:
  • Go to FTS
  • Set search type to "boolean"
  • Enter yellow OR blue in the keyword search and press search


Result: No results found

Expected: Will see results that correspond to the "blue" keyword


6) issue no. 20613  : Search > Fuzzy Typo not working

Steps to reproduce:
  • Go to FTS
  • Enter term "blun" into keyword search
  • Set fuzzy typo to include 1 letter
  • enter search


Result: No results found

Expected: Should see results that correspond as if I entered "blue"



Please let us know if you need anything else.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh , I made a change to the SearchController to accept searches without keywords.
The update is in the OneDrive folder "2021-01-06 FTS update"

2) issue no : 20572 - FTS > Cannot perform search with filters

Should be fixed with the update. For the index you sent me, the following test search returns 1 result:

{
    "searchRequest": "",
    "FilterStatement": {
        "type": "boolean",
        "operator": "or",
        "clauses": [
            {
                "type": "match",
                "exclude": false,
                "field": "language",
                "value": "French"
            }
        ]
    },
    "SortField": "sortingdate",
    "SortOrder": "desc",
    "PageSize": 50,
    "PageNum":0
}

3) issue no : 20586 - Search with basic filter does not work

I don't see any issue here - I'm getting only results with English.

My search payload:

{
    "searchRequest": "test",
    "FilterStatement": {
        "type": "boolean",
        "operator": "or",
        "clauses": [
            {
                "type": "match",
                "exclude": false,
                "field": "language",
                "value": "English"
            }
        ]
    },
    "SortField": "sortingdate",
    "SortOrder": "desc",
    "PageSize": 50,
    "PageNum":0
}

or you can send it as:

{
    "searchRequest": "test",
    "FilterStatement": {
                "type": "match",
                "field": "language",
                "value": "English"
    },
    "SortField": "sortingdate",
    "SortOrder": "desc",
    "PageSize": 50,
    "PageNum":0
}

In both cases I get 4 results, all English.

I suspect there's some issue with the structure of your request. Please send me the payload you're submitting to the service.
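
For anyone comparing their own request against the payloads above, here is a minimal Python sketch that builds the boolean-filter form. The field names are copied from the payloads in this comment; the helper names `match_clause` and `build_search_payload` are made up for illustration and are not part of the web service.

```python
import json

def match_clause(field, value, exclude=False):
    """Build one 'match' filter clause, shaped like the clauses shown above."""
    return {"type": "match", "exclude": exclude, "field": field, "value": value}

def build_search_payload(keyword, clauses, operator="or", page_size=50, page_num=0):
    """Assemble a search payload with the clauses wrapped in a boolean filter.

    Hypothetical helper: it only mirrors the JSON structure from this thread.
    """
    return {
        "searchRequest": keyword,
        "FilterStatement": {
            "type": "boolean",
            "operator": operator,
            "clauses": list(clauses),
        },
        "SortField": "sortingdate",
        "SortOrder": "desc",
        "PageSize": page_size,
        "PageNum": page_num,
    }

payload = build_search_payload("test", [match_clause("language", "English")])
print(json.dumps(payload, indent=4))
```

Printing the payload before sending it makes it easy to diff against the working examples.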

4) issue no. 20611 - Search > Not able to search without inputting a keyword
This looks like a duplicate of #2, should be working now.

5) issue no. 20612 : Search > Boolean search not working

Looks good to me. The following search returns 5 results for me, highlighting both "blue" and "yellow":

{
    "searchRequest": "yellow OR blue",
    "searchType": "Boolean",
    "SortField": "sortingdate",
    "SortOrder": "desc",
    "PageSize": 50,
    "PageNum":0
}

Please send your request payload.

6) issue no. 20613  : Search > Fuzzy Typo not working

Looks good to me. The following search returns 3 results, highlighting "Blue":

{
    "searchRequest": "blun",
    "Fuzzy": true,
    "SortField": "sortingdate",
    "SortOrder": "desc",
    "PageSize": 50,
    "PageNum":0
}

Please send me your payload.

I ran all tests with the index you sent me.

I'm still investigating #1 and will get back to you on it later.
Radomir Mladenovic, Contegra
1 ) issue no : 20579 - Search results with no pinpoint references

As discussed before, we need a better description of the HTML format from you. This issue is related to that.

The test file you sent us initially was simple:
  • <div> element with the paragraph number had "paraBullet" class, and
  • the following <div> element with the paragraph text had "ParaText" class
When you search for "interpretation", one of the documents without paragraph highlights is "OTI-0082 - ILC Draft Articles on Diplomatic Protection (2006).html". Please check this document and provide details on how to extract paragraphs from it. To me, this document looks messy even visually - deep nesting, repeating paragraph numbers, etc.

To successfully extract paragraph content we really need a description of the format(s) you're using. Pick not the simplest, but the most complex examples of content to be handled, and explain how to get data from them - what makes a paragraph.

Thanks,
Radomir
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

I have attached our HTML coding manual. The HTML structure is defined by this coding manual.

The document covers all the scenarios, so please refer to it for the HTML description.

Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh ,

Thanks for the document. From what I can tell, it focuses mainly on the visual aspects, with not much information about extracting paragraph data.

1. Check this screenshot from the above mentioned sample document. Two paragraphs are marked with (3), on the same level. My assumption was that paragraph numbers are unique. How should this be represented in index?


2. The text from page footers, marked with class "pdffootnote", will be indexed as part of the HTML. However, as far as I can tell, this doesn't belong to any particular paragraph, so there will be no paragraph in search results for matches in this content. Makes sense?

3. If you have information on which text belongs to which paragraph, can you add additional data attributes to the content (e.g. similar to the "data-key" attributes that exist in the HTML) to make text extraction simple? For example, adding a "data-para" attribute to a text div, where the value is the paragraph number, would make the extraction way simpler.
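
To illustrate point 3, here is a rough Python sketch of how simple extraction could become if such an attribute existed. The "data-para" attribute is only the proposal above, the sample HTML is made up, and nested "data-para" divs are not handled; this is not the indexer's actual parser.

```python
from html.parser import HTMLParser

class ParaExtractor(HTMLParser):
    """Collect the text of every <div data-para="..."> in document order."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []   # list of (paragraph number, text)
        self._current = None   # paragraph number currently being captured
        self._depth = 0        # <div> nesting depth inside the captured div
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self._current is not None:
            self._depth += 1   # a nested div inside the captured paragraph
            return
        para = dict(attrs).get("data-para")
        if para is not None:
            self._current, self._depth, self._chunks = para, 1, []

    def handle_endtag(self, tag):
        if tag != "div" or self._current is None:
            return
        self._depth -= 1
        if self._depth == 0:   # the captured paragraph div just closed
            text = " ".join("".join(self._chunks).split())
            self.paragraphs.append((self._current, text))
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self._chunks.append(data)

# Made-up sample; note the paragraph number "3" appears twice, as in point 1.
sample = (
    '<div class="paraBullet">(3)</div>'
    '<div class="ParaText" data-para="3">First <b>paragraph</b> text</div>'
    '<div class="paraBullet">(3)</div>'
    '<div class="ParaText" data-para="3">Second paragraph text</div>'
)
extractor = ParaExtractor()
extractor.feed(sample)
print(extractor.paragraphs)
```

The sample deliberately yields two entries with the same number, which is why the number alone would not identify a paragraph.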

Thanks,
Radomir
Harsh Parikh, Tech Lead at DevIT
Hi Piyush Rathod, DevIT Piyush   and Jitesh Dhuravala, DevIT Jitesh ,

Could you please answer the questions above from Radomir Mladenovic, Contegra Radomir ?
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

For the 2nd question, you are right: the second paragraph no. 3 in the screenshot is a footnote, and we need only paragraphs in the FTS search results. We don't need footnotes. But I need confirmation from Morgan.

Morgan Maguire, CEO Morgan   Please confirm.

For the 3rd point, Radomir Mladenovic, Contegra Radomir , we can't change the HTML structure now, as we have converted all the PDF documents into HTML both by algorithm and manually, so it is not possible to set any attributes in the HTML files.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir

Could you please send the OneDrive link again, as we are unable to open it? Please also list the bug numbers that you resolved.

What do we need to do on our side? Can we simply replace the search controller with the one from OneDrive?

We are using the following link, but it does not open:

https://1drv.ms/u/s!AugzRBG6eTFwg4sJwygyvkWxOQnMWg?e=gfh2JQ


Is it possible to send only the SearchController file so we can replace it?

Please provide the OneDrive link.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh One Drive link is https://1drv.ms/u/s!AugzRBG6eTFwg4sJwygyvkWxOQnMWg?e=2MNKXq

As for issue numbers, the only issue I see is with HTML parsing. Please check my previous comments, where I listed your bug numbers and the search payloads I used to test.
Morgan Maguire, CEO
Hi Harsh Parikh, Tech Lead at DevIT Harsh and Radomir Mladenovic, Contegra Radomir ,

Responding to Harsh Parikh, Tech Lead at DevIT Harsh 's response above: Re: Search Implementations - TOLOGIX - Contegra Search Audit, please note that all text in the documents needs to be indexed within the FTS, including footnotes, headings, TOC, etc. We need to ensure a result is displayed no matter the context of the text, similar to what it currently does for PDFs in the legacy application.

Thanks,

Morgan
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh I'm trying to make sense of the document "OTI-0082 - ILC Draft Articles on Diplomatic Protection (2006).html" that you sent me on Jan 8.
I'm sending you the paragraphs extracted from this document. The first column is the paragraph number, the second column is the paragraph text. Please review and correct what is needed.

In this document I didn't include footnotes. The plan is to add each footnote to the paragraph that references it. However, we need to make a proper paragraph extraction first.

My assumption was that the paragraphs would have unique numbers within the document. However, that doesn't appear to be the case. Check, for example, the paragraph with number (1) - it appears in multiple parts of the document. That means in search results we should have multiple result paragraphs (1) for the same document, correct?
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh Morgan Maguire, CEO Morgan Any update/feedback on my question above about the paragraph extraction?
Morgan Maguire, CEO
Hi Radomir Mladenovic, Contegra Radomir , I'll let Harsh Parikh, Tech Lead at DevIT Harsh respond to this one. Note that Harsh Parikh, Tech Lead at DevIT Harsh was on leave Wednesday, Thursday and Friday last week.

Morgan 
Harsh Parikh, Tech Lead at DevIT
Jitesh Dhuravala, DevIT Jitesh   and Piyush Rathod, DevIT Piyush , Could you please provide your feedback on Radomir Mladenovic, Contegra Radomir 's question regarding html structure.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

I have one question for you. When indexing runs every day, will the ParaId parameter change for the same document?

If yes, it will create an issue for us, because we have bookmark functionality in the application that lets a user bookmark any pinpoint paragraph from the search results.

We use ParaId to fetch the HTML or PDF text.

For example, a user searches for a keyword and bookmarks paragraph 8 in the application.

When the bookmark for paragraph 8 is saved, the HTML text is displayed through the API. But if we check the next day, after the index has been regenerated, the ParaId that we saved has changed.

We need the same ParaId at all times to fetch the HTML or PDF text.

Please provide your feedback. Also, we are available to discuss on Skype between 11:00 AM and 6:00 PM IST.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh correct, paraId might change with an index update. It's not a good candidate for use in bookmarks. I'd suggest using the real paragraph number, but that also makes sense only if paragraph numbers are unique in the document.
Depending on feedback I get on paragraph extraction questions, we'll probably need to change indexing and how paragraphs are referenced in the index. I'd wait on these answers before making further changes.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

I talked with the HTML team and, in the HTML you provided above, the IDs of the two paragraphs numbered 1 are different.

1st Paragraph Id :   pa1
2nd Paragraph Id : pa1.1

So if a user searches for a keyword and it matches in both paragraphs, we should display both paragraph numbers in the application.

Morgan Maguire, CEO Morgan  , I hope I am correct.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

The ParaId should not change when the indexes are regenerated, because we use ParaId in the bookmark functionality to retrieve the HTML text.
Morgan Maguire, CEO
Hi Harsh Parikh, Tech Lead at DevIT Harsh and Radomir Mladenovic, Contegra Radomir ,

Following up on the above, yes, it will be common in certain documents for the same paragraph number to be used across the same document, particularly within Treaties and Arbitration Rules. This is why paragraph IDs need to be used as the unique identifiers for paragraphs and footnotes across a document.

For example, within ARB/0029 - ICSID Arbitration Rules (2006) you see that each rule contains subparagraphs that are numbered according to conventional bulleting where the numbering restarts under each rule. As a result, to ensure each subparagraph can be uniquely identified, we have inserted references to the relevant rule or section into each paragraph ID:

Therefore, as Harsh Parikh, Tech Lead at DevIT Harsh has already pointed out, paragraph IDs need to be used for indexing purposes.

Radomir Mladenovic, Contegra Radomir , will using paragraph IDs cause a problem for the indexing? Why aren't paragraph IDs good candidates for bookmarks?

Thanks,

Morgan
Radomir Mladenovic, Contegra
Thank you Morgan Maguire, CEO Morgan , that's a useful explanation.

The ParaId that I said is not suitable for bookmarking is not your paragraph ID. It's the dtSearch document number, which we used for fast retrieval of found paragraphs. For bookmarking, you should use your paragraph ID, and I'll look into changing the web service to support data retrieval using this ID.
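
As a toy illustration of the difference, the Python sketch below uses sequential numbers as a stand-in for index-internal IDs. How dtSearch actually assigns document numbers is not described in this thread, and the file names and the "r1-pa1" ID are made up; only "pa1" and "pa1.1" come from the discussion above.

```python
def build_index(paragraphs):
    """Assign sequential numbers, standing in for index-internal ids."""
    return {seq: key for seq, key in enumerate(paragraphs, start=1)}

# Each paragraph is identified by (document file, paragraph ID from the HTML).
day1 = [("OTI-0082.html", "pa1"), ("OTI-0082.html", "pa1.1")]
# Next day a new document is indexed first, shifting every number by one.
day2 = [("ARB-0029.html", "r1-pa1")] + day1

index1 = build_index(day1)
index2 = build_index(day2)

bookmark_by_number = 2                       # saved on day 1
bookmark_by_key = ("OTI-0082.html", "pa1.1") # saved on day 1

print(index1[bookmark_by_number])        # the paragraph the user bookmarked
print(index2[bookmark_by_number])        # a different paragraph after re-indexing
print(bookmark_by_key in index2.values())  # the stable key still resolves
```

A bookmark stored as the document plus its own paragraph ID survives re-indexing, while one stored as an index-assigned number may not.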

How should footnotes be indexed and referenced? Should they be indexed with a paragraph referencing them, or separately?
Morgan Maguire, CEO
Sounds good, Radomir Mladenovic, Contegra Radomir .

The footnotes should be indexed separately without the paragraph that references them. In other words, we want the user to be directed to the text of the footnote itself if that produces a hit for the searched keyword. Does that make sense?

Morgan
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

In bookmarks we use the same ParaId as in the FTS module to retrieve text from HTML and PDF.

So my assumption is that you will change this in your web service and it will work in both FTS and bookmarks, with no need to change anything in the application.

Please note that we save the URL in the database, and that URL contains the ParaId parameter for the bookmark.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

There are 2 bugs reported by the Industrial team. Please look into them.

To check both issues you can use the ISLGRebuild database on the server. All PDF and HTML documents can be found in the following path: E:\ISLGRebuildDemo\wwwroot\Documents

21136 : Search > All words > no pinpoint references

Steps to reproduce:
  • go to FTS
  • enter the terms 'greek and september or tribunal' into the search field
  • select 'All words' as the search type
  • submit the search

Result: Search results cards  do not include pinpoint references

Expected: A paragraph link must exist for each paragraph in the document where at least one of the  keywords from the search was found




21138 : Search > Keyword not highlighted


Steps to reproduce:
  • Go to FTS
  • enter the term 'like' into keyword input
  • submit search
  • from search results, find a result with pinpoint references
  • Select a pinpoint reference to preview excerpt


Result: No keyword is highlighted in the result. The keyword does not seem to appear in the paragraph.

Expected: Only paragraphs with keyword appearances will be available as pinpoint references to preview excerpts. The keyword must be highlighted in the excerpt preview.


Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh ,

I made changes to the HTML parser. To see how extraction worked for your sample file, check:

On the OneDrive (https://1drv.ms/u/s!AugzRBG6eTFwg4sJwygyvkWxOQnMWg?e=2MNKXq) under the folder "2021-01-23" you can find updates for the indexer and web service.

Before indexing with the new indexer, delete FTS index folders completely - new index is needed as there have been some changes in the structure.

In the results, paraId is now a string, not a number, and it looks like this:


In the paragraph highlighting request you also need to send this id:


Let me know if you have any question.


As for the bugs you sent:

21136  "A paragraph link must exist for each paragraph in the document where at least one of the  keywords from the search was found"

This is a new requirement for me. We'll need to parse complex queries that contain multiple terms and boolean expressions to extract the keywords only and then find the paragraphs. I'll let you know when this is ready.
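
A minimal Python sketch of that idea, under loud assumptions: only the operators "and", "or" and "not" are recognised, other dtSearch query syntax is ignored, and the function names are made up. It is not the web service's implementation.

```python
import re

# Assumption: only these three boolean operators are stripped from the query.
OPERATORS = {"and", "or", "not"}

def extract_keywords(query):
    """Return the distinct search terms of a query, dropping operators."""
    tokens = re.findall(r'[^\s()"]+', query.lower())
    keywords = []
    for token in tokens:
        if token not in OPERATORS and token not in keywords:
            keywords.append(token)
    return keywords

def paragraphs_with_hits(paragraphs, keywords):
    """Return IDs of paragraphs containing at least one of the keywords."""
    wanted = set(keywords)
    hits = []
    for para_id, text in paragraphs:
        words = set(re.findall(r"\w+", text.lower()))
        if words & wanted:
            hits.append(para_id)
    return hits

keywords = extract_keywords("greek and september or tribunal")
print(keywords)  # ['greek', 'september', 'tribunal']

# Made-up paragraphs, keyed by hypothetical paragraph IDs.
sample = [
    ("pa1", "The Tribunal was constituted in September."),
    ("pa2", "No relevant terms appear here."),
    ("pa3", "The Greek delegation objected."),
]
print(paragraphs_with_hits(sample, keywords))  # ['pa1', 'pa3']
```

This matches the expectation in bug 21136 that a paragraph qualifies when at least one keyword appears in it.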

21138 I'm not getting any results for "like" so cannot reproduce this. 
Radomir Mladenovic, Contegra
I updated the service with a quick fix for bug 21136, let's see if that helps.
Harsh Parikh, Tech Lead at DevIT
OK, thanks Radomir Mladenovic, Contegra Radomir . We will check this by Wednesday and update you, as tomorrow is a national holiday here.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We need to discuss the following bugs with you on a Skype call. Please confirm a time. We are available between 11:00 AM and 6:00 PM IST.

21138 Search > Keyword not highlighted

Steps to reproduce:
  • Go to FTS
  • enter the term 'like' into keyword input
  • submit search
  • from search results, find a result with pinpoint references
  • Select a pinpoint reference to preview excerpt


Result: No keyword is highlighted in the result. The keyword does not seem to appear in the paragraph.

Expected: Only paragraphs with keyword appearances will be available as pinpoint references to preview excerpts. The keyword must be highlighted in the excerpt preview.





21217 : Preview excerpt > must not be displayed if not keyword was included in query

Steps to reproduce:
  • Go to FTS
  • Submit a search with no keyword 

Result: Search results include pinpoint references to preview excerpts

Expected: No preview excerpts should be available because I have not included a keyword in my search query



21218 : empty results appearing in search


19515 : Any words search > no highlight (In Subject Navigator Index)

Steps to reproduce:
  • Go to the subject navigator
  • expand the search options accordion below the search bar
  • select 'All words' option
  • in the search bar enter the terms 05 bridgestone
  • hit enter

Result: The search is performed and the tree is filtered to display only matching branches (and parent branches of). No matching terms are highlighted in the results.

Expected:
  • terms that match the search will be highlighted within the results




20613 : Search > Fuzzy Typo not working (In Subject Navigator Index)


Steps to reproduce:
  • Go to FTS
  • Enter term "blun" into keyword search
  • Set fuzzy typo to include 1 letter
  • enter search


Result: No results found

Expected: Should see results that correspond as if I entered "blue"


20867 : SN > Fuzzy typo not working (In Subject Navigator Index)
 
Steps to reproduce:
  • Go to SN
  • Set search setting for fuzzy typo to 2
  • type in "awurd"


Result: No results found

Expected: Should see results for "award"



Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , are these issues reproducible in the database I have access to? Please let me know which database and documents path to use.

We can talk tomorrow at about 13:30 IST, but please provide me with the examples to test today. There's not much sense in having a call if I have no access to test data.

Thanks,
Radomir
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

You can check those bugs on the database ISLGRebuild (server IP : 10.68.138.11)
The Document Path : E:\ISLGRebuildDemo\wwwroot\Documents


We will discuss further tomorrow at 1:30 PM IST.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh ,

21217 : Preview excerpt > must not be displayed if not keyword was included in query

This was intentional, as you didn't specify the desired behaviour when there is no search query. On OneDrive (https://1drv.ms/u/s!AugzRBG6eTFwg4sJwygyvkWxOQnMWg?e=2MNKXq) under the folder "2021-01-28" you can find the updated web service that fixes this.

I could not reproduce the other issues you mention, as I couldn't find the keywords you mention in the index. I guess we're still indexing different data?

For tomorrow's call, please prepare to send me the indices for FTS and subjectnav, as well as the indexing logs created when indexing each collection.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

As discussed in today's call, I have attached two zip folders:

1) Full Text Search (includes the Indexes folders and the indexlog file)
2) Subject Navigator (includes the Indexes folder and the indexlog file)




I have also attached the JSON that we pass to the web service.

Fuzzy Typo Bug :

{"searchRequest":"blun","SearchType":"3","Stemming":false,"Synonyms":false,"Fuzzy":true,"Fuzziness":"1","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"DocumentContentTypeId","values":["37","13","12"]}]},"PageNum":0,"PageSize":20}


Highlight issue with like keyword

{"searchRequest":"like","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"DocumentContentTypeId","values":["37","13","12"]}]},"SearchType":"3","Stemming":false,"Synonyms":false,"Fuzzy":false,"Fuzziness":"1","SortField":"hits","SortOrder":"desc","PageNum":0,"PageSize":"20"}
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh ,

1) Your FTS index was not created with the latest update that I sent you on Jan 23. I can tell that by "#para_" appearing in the paraId string. I also sent you the source code update for it so you can see for yourself. The update also trims extra whitespace that appears in some HTML element IDs. As your index was not created with the latest update, there's not much sense in testing it.

I've noticed in your index that some paragraph IDs even use symbols (I saw some kind of dot). I'm not sure whether dtSearch will work properly when finding those. To prevent issues with these, I made a change to the indexer to encode both the file path and the paragraph ID.
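
The thread does not say which encoding the indexer uses. Purely as an illustration of the idea, the Python sketch below hex-encodes the UTF-8 bytes of a value so the stored token contains only digits and the letters a to f; the function names are made up.

```python
def encode_token(value):
    """Trim surrounding whitespace, then hex-encode the UTF-8 bytes.

    Assumption for illustration only: hex is one possible symbol-free
    encoding, not necessarily what the indexer does.
    """
    return value.strip().encode("utf-8").hex()

def decode_token(token):
    """Reverse encode_token to recover the original (trimmed) value."""
    return bytes.fromhex(token).decode("utf-8")

token = encode_token(" pa1.1 ")
print(token)
print(decode_token(token))
```

Because the encoding is reversible, the original paragraph ID can still be returned in search results and used in later requests.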

Please, take the indexer update from OneDrive folder 2021-01-30, delete old FTS indexes and re-index!
https://1drv.ms/u/s!AugzRBG6eTFwg4sJwygyvkWxOQnMWg?e=2MNKXq

2) You cannot search for keyword "like" specifically because it's a stop word, listed in the file noise.dat (dtsearch default list). Review that file and remove all keywords you want to use in search - or delete the file.
After that do full re-indexing!
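
To check quickly which query terms a noise list would suppress, something like the Python sketch below could be used. It assumes the list is plain text with whitespace-separated words and ignores any wildcard entries; the sample list is made up apart from "like", which is the word discussed here.

```python
def load_noise_words(text):
    """Parse a noise list, assuming whitespace-separated words."""
    return {word.lower() for word in text.split()}

def blocked_terms(query, noise_words):
    """Return the query terms that appear in the noise list."""
    return [term for term in query.lower().split() if term in noise_words]

# Made-up excerpt; in practice the text would be read from noise.dat.
sample_noise = "a an and like the"
noise = load_noise_words(sample_noise)

print(blocked_terms("like", noise))            # ['like']
print(blocked_terms("tribunal award", noise))  # []
```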

3) Why do you think that "blun" should find "blue" with Fuzziness=1? Did you find this example in the dtSearch documentation? I see that it finds it when Fuzziness=2 so I'd say that fuzzy search works.

4) Subject Navigation highlighting for more than one term works for me:

5) Why do you send "SearchType": "3" in your search request? Are you sure this binds to the desired search type?

 
Harsh Parikh, Tech Lead at DevIT
Hi Melissa Cowell, General Manager at Industrial Melissa and Savannah Mitchell, Project Manager at Industrial Savannah ,

Please see the following answer from Radomir Mladenovic, Contegra Radomir on the fuzzy typo bugs (bug nos. 20867 and 20613):

Why do you think that "blun" should find "blue" with Fuzziness=1? Did you find this example in the dtSearch documentation? I see that it finds it when Fuzziness=2 so I'd say that fuzzy search works.


Also, Melissa Cowell, General Manager at Industrial Melissa , please look at Radomir Mladenovic, Contegra Radomir 's answer for bug no. 21138:

You cannot search for keyword "like" specifically because it's a stop word, listed in the file noise.dat (dtsearch default list). Review that file and remove all keywords you want to use in search - or delete the file.
After that do full re-indexing! We will re-index this week and let you know once it is completed.


Radomir Mladenovic, Contegra Radomir , could you please let us know how we can remove noise words from indexing? We have already removed those noise words in the dtSearch Desktop version that we use for the old legacy application, but we are not sure how to remove the noise word list for the new rebuild indexing. Please let us know.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh the indexer takes noise.dat from the application folder (the folder containing indexer exe). I have this file in my test folder on the server so I guess you have it as well because the same distribution was sent to you.
Harsh Parikh, Tech Lead at DevIT
Thanks Radomir Mladenovic, Contegra Radomir . We have removed all noise words from the indexer.

Melissa Cowell, General Manager at Industrial Melissa , when we release the next build, please take note of the following point for bug no. 21138:

You cannot search for keyword "like" specifically because it's a stop word, listed in the file noise.dat (dtsearch default list). Review that file and remove all keywords you want to use in search - or delete the file.
After that do full re-indexing! We will re-index this week and let you know once it is completed.
Morgan Maguire, CEO
Hi Harsh Parikh, Tech Lead at DevIT Harsh and Radomir Mladenovic, Contegra Radomir ,

Re bug no. 21138, please note that in the legacy application, we have removed all keywords from the noise.dat file. Please do the same for the new application to ensure keywords similar to "like" will generate hits.

Re bug no. 20613, Radomir Mladenovic, Contegra Radomir and Rob Wiesenberg, Contegra Rob could you please explain why a search for "blun" did not produce a result for "blue" when fuzzy typo = 1 was enabled? If you read through the dtSearch support page on fuzzy searching, it seems like this should have produced a hit: https://support.dtsearch.com/webhelp/dtsearch/fuzzy_searching.htm

Thanks,

Morgan
Radomir Mladenovic, Contegra
Hi Morgan Maguire, CEO Morgan and Harsh Parikh, Tech Lead at DevIT Harsh ,

I understand your question about the fuzzy search but we don't have an insight into dtSearch internal implementation. The web service passes parameters to the dtSearch API and apparently that works, considering there are results for fuzziness=2.
Unless Rob Wiesenberg, Contegra Rob has an answer, you'll have to address this question to dtSearch support.

Thanks,
Radomir
Morgan Maguire, CEO
OK. Sounds good, Radomir Mladenovic, Contegra Radomir . Unless, Rob Wiesenberg, Contegra Rob has further insight, let's move on and consider bug no. 20613 resolved.

At the same time, please note my instructions on bug no. 21138 and removing all keywords from the noise.dat file.

When can we expect the FTS and Subject Navigator searches to be fully implemented within staging.investorstatelawguide.com?

Thanks,

Morgan
Rob Wiesenberg, Contegra
Morgan Maguire, CEO Morgan , I believe that the fuzzy matching is based on the percentage of the search term that matches the term in the text. This might explain why a four-letter search term with one incorrect letter does not match when the fuzziness is set to 1. Also, there is an API call that uses a % value. I am waiting for confirmation from dtSearch. Will let you know.
Morgan Maguire, CEO
Thanks Rob Wiesenberg, Contegra Rob . Any clarification we could relay to users would be appreciated.

Morgan
Radomir Mladenovic, Contegra
Morgan Maguire, CEO Morgan Harsh Parikh, Tech Lead at DevIT Harsh   I don't think there's any unaddressed bug or unfinished functionality on my end. Please let me know if there's anything else.
Morgan Maguire, CEO
Ok. Sounds good, Radomir Mladenovic, Contegra Radomir .

Harsh Parikh, Tech Lead at DevIT Harsh , let us know when the updated versions of the Subject Navigator and FTS searches are deployed to staging.investorstatelawguide.com, so that Melissa Cowell, General Manager at Industrial Melissa and Naomi Joanis, UX Team Lead at Industrial Naomi can complete their UAT.

Thanks,

Morgan
Rob Wiesenberg, Contegra
Morgan Maguire, CEO Morgan dtsearch has confirmed that the 1-10 fuzzy designation is not based on a letter count. 

The correspondence between search fuzziness with the default 1 to 10 setting and character discrepancies is not "1 to 1"

However, you can fine-tune fuzziness "manually" with the % character. Please see pages 42-43 of https://support.dtsearch.com/faq/dtSearch_Desktop.pdf for more on this.
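
For context on why the bug reports expected a match: measured as plain edit distance, "blun" and "blue" differ by a single substitution, as do "awurd" and "award". The Python sketch below computes that letter-count distance (Levenshtein). It is not dtSearch's fuzziness algorithm, which per the confirmation above does not map 1 to 1 to character discrepancies.

```python
def levenshtein(a, b):
    """Count the single-character insertions, deletions and substitutions
    needed to turn string a into string b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]

print(levenshtein("blun", "blue"))    # 1
print(levenshtein("awurd", "award"))  # 1
```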
Morgan Maguire, CEO
OK. Thanks Rob Wiesenberg, Contegra Rob .

Melissa Cowell, General Manager at Industrial Melissa and Naomi Joanis, UX Team Lead at Industrial Naomi , we should take note of this and make the appropriate updates to the sections in the Knowledge Centre that address these options in the Full Text Search and Subject Navigator searches.

Thanks,

Morgan 
Naomi Joanis, UX Team Lead at Industrial
Noted!
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Today, We are going to create Indexing on server. But, Don't know it is not created any module indexing.

The server details are as follows:

Server IP : 10.68.138.10
TologixDBIndexer : E:\DevContegraISLGRebuildStagingDBIndexer
Document Path :   E:\ISLGRebuildStaging\wwwroot\Documents\
Indexing Path : E:\DevContegraISLGRebuildStagingIndexes\


Database Name :  ISLGRebuildStaging (Server 10.68.138.11)

Please note that we are now creating the indexes on migrated data, so the amount of data is large.

Please treat this as high priority, as we need to deploy the project on the staging server tomorrow.

I have also attached the indexing log from our attempt to generate the indexes.

Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh  

> Today, We are going to create Indexing on server. But, Don't know it is not created any module indexing.

You can put the indexer and index files wherever you like, the same as you have done so far. I really don't understand what you are asking me to do here.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Today we are indexing the migrated data. We are using the database ISLGRebuildStaging on server 10.68.138.11.

Please note that we are now using migrated data, so the amount of data is very large.

I have attached the FTS module index and the indexing log file.

Somehow the indexes were not generated properly for all modules.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh it looks like your query is too slow, so the query timed out. There's an error in the log:

2021-02-04 05:24:12,541 [1] ERROR - Failure in retrieving data from the database
System.Data.OleDb.OleDbException (0x80040E31): Query timeout expired
   at System.Data.OleDb.OleDbDataReader.ProcessResults(OleDbHResult hr)
   at System.Data.OleDb.OleDbDataReader.NextResult()
   at System.Data.OleDb.OleDbCommand.ExecuteReaderInternal(CommandBehavior behavior, String method)
   at System.Data.OleDb.OleDbCommand.ExecuteReader(CommandBehavior behavior)
   at System.Data.Common.DbDataAdapter.FillInternal(DataSet dataset, DataTable[] datatables, Int32 startRecord, Int32 maxRecords, String srcTable, IDbCommand command, CommandBehavior behavior)
   at System.Data.Common.DbDataAdapter.Fill(DataSet dataSet, Int32 startRecord, Int32 maxRecords, String srcTable, IDbCommand command, CommandBehavior behavior)
   at System.Data.Common.DbDataAdapter.Fill(DataSet dataSet, String srcTable)
   at DBIndexer.SampleDataSource.RetrieveDataFromDB() in p:\contegra\contegra-tologix\DBIndexer\SampleDataSource.cs:line 582

Judging by the timestamps of the log messages, the query timeout appears to be 30 seconds.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

The data size has now increased, so please check all the queries and views we provided to you and increase the timeout for generating the indexes. Because of the large amount of data, the views and queries take a long time to return all the data.


1) Subject Navigator
2) Dispute Document
3) Full Text Search.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh

You can get the indexer update from 2021-02-04 folder on OneDrive
https://1drv.ms/u/s!AugzRBG6eTFwg4sJwygyvkWxOQnMWg?e=2MNKXq

I enabled an unlimited timeout for the SQL command.

BTW, I recommend editing TologixDBIndexer.exe.conf and setting the default logging level to INFO. It's currently on DEBUG, so your indexing log will be huge.

How long does it take for FE_MetafieldwithValueDynamicFTS to complete and start returning data? I started the process almost half an hour ago and I'm still waiting...
Radomir Mladenovic, Contegra
Documents finally started indexing after about half an hour. I noticed an HTML parsing error related to paragraph extraction and, for now, made a quick fix so that it doesn't break (the update is in the above-mentioned folder), but the paragraph extraction still needs to be checked.
Radomir Mladenovic, Contegra
I looked into the details of one breaking HTML document. The problem I see is inconsistent use of the "paralvl" classes: they are nested in an unexpected order. For example, paralvl1 can be found as a child of a paralvl2 element, while in other documents it's the other way around. This causes a big issue when extracting the paragraph number, and not all levels are properly collected (see the 4th column, "paraFiltered", in the attached file).
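To find which documents have this nesting problem, a small audit script could walk each HTML file and record how "paralvl" elements enclose one another. The sketch below is only an illustration in Python, not part of the indexer: the class-name pattern `paralvlN` comes from the message above, while the helper names and the assumption that lower-numbered levels should enclose higher-numbered ones are hypothetical. It also assumes reasonably well-formed HTML (every non-void start tag has an end tag).

```python
from html.parser import HTMLParser

# Elements that never have an end tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ParaLevelAudit(HTMLParser):
    """Collect (enclosing_level, nested_level) pairs for paralvlN elements."""

    def __init__(self):
        super().__init__()
        self.stack = []     # one entry per open element: paralvl number or None
        self.nestings = []  # (enclosing level, nested level)

    def handle_starttag(self, tag, attrs):
        level = None
        for cls in (dict(attrs).get("class") or "").split():
            if cls.startswith("paralvl") and cls[7:].isdigit():
                level = int(cls[7:])
        if level is not None:
            # Nearest open ancestor that carries a paralvl class, if any.
            enclosing = next((l for l in reversed(self.stack) if l is not None), None)
            if enclosing is not None:
                self.nestings.append((enclosing, level))
        if tag not in VOID_TAGS:
            self.stack.append(level)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()


def inverted_nestings(html):
    """Return nestings where a lower level sits inside a higher one (assumed unexpected)."""
    audit = ParaLevelAudit()
    audit.feed(html)
    audit.close()
    return [(outer, inner) for outer, inner in audit.nestings if inner < outer]


expected_order = '<div class="paralvl1"><p class="paralvl2">a</p></div>'
unexpected_order = '<div class="paralvl2"><br><span class="x paralvl1">b</span></div>'
print(inverted_nestings(expected_order))
print(inverted_nestings(unexpected_order))
```

Running such a check over the documents folder would show how many files use each nesting order before deciding which column to expose as the paragraph number.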

However, looking at the attached example, I think that the 3rd column ("idFiltered") makes more sense to return to the user as the paragraph number. It better indicates whether the item is a paragraph, a footnote, etc.

Morgan Maguire, CEO Morgan Harsh Parikh, Tech Lead at DevIT Harsh   Please let me know what you think.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh you can go and create indexes regardless of how we proceed with showing paragraphs. I already index both fields so we need just a change in the search service to swap them in the results.
Morgan Maguire, CEO
Hi Radomir Mladenovic, Contegra Radomir ,

I think this is really a question for Harsh Parikh, Tech Lead at DevIT Harsh and Jitesh Dhuravala, DevIT Jitesh to determine, but I don't see any issues with using the idFiltered column to identify the appropriate paragraphs. It should contain all the unique ID properties needed to prevent duplicate IDs within the same document.
 
Thanks,

Morgan 
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

The Full Text Search module index is now created after we replaced TologixDBIndexer.exe and set the logging level to INFO in the config file.

However, the Subject Navigator module index is not being generated.

The indexing log says that no tables or views were found, but we do have vw_SubjectNavigatorSearch in the ISLGRebuildStaging database.

The indexing log:

2021-02-05 02:11:45,989 [1] INFO  - Execute IndexJob
2021-02-05 02:11:46,320 [1] INFO  - Retrieve data from database
2021-02-05 02:12:17,008 [1] WARN  - No tables/views found
2021-02-05 02:12:17,141 [1] INFO  - Done


Could you please check and confirm ?
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh I won't have access to your environment until late afternoon. In the meantime, please check your environment and database access rights. Nothing changed in the indexer code that would affect finding the table. Verify the indexer config file as well.
For this case you may want to enable DEBUG logging to check for any other messages.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir

I have checked and everything looks OK. Can we have a quick call to check?
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

The indexes for the following 2 modules are not being created (the indexing message is below):

1) Dispute Document
2) Subject Navigator

Can we have a call on Monday (8th February) between 11:00 AM and 6:00 PM IST? Please confirm.

Indexing message :
2021-02-05 02:11:45,989 [1] INFO  - Execute IndexJob
2021-02-05 02:11:46,320 [1] INFO  - Retrieve data from database
2021-02-05 02:12:17,008 [1] WARN  - No tables/views found
2021-02-05 02:12:17,141 [1] INFO  - Done
Radomir Mladenovic, Contegra
Yes, we can talk on Monday after 13:00 IST
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh
The Subject Navigator indexing was also suffering from the timeout issue. I set the timeout to unlimited. You can download the indexer update from the folder "2021-02-06".

I have no idea why you got the message that no tables/views were found. I'm also sending you my indexer config file in the above-mentioned folder.

Maybe you should look into speeding up the view with indexes or something similar. I ran the indexer and it took about 40 minutes to start receiving the data, and about 20 to index.
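The elapsed time between two log lines is a quick way to spot a timeout. As an illustration, the sketch below (Python, with hypothetical helper names) measures the gap between the "Retrieve data from database" and "No tables/views found" lines from the 2021-02-05 log quoted earlier in this thread. It assumes every line starts with a 23-character timestamp in the format shown there.

```python
from datetime import datetime

LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def parse_log_time(line):
    # The first 23 characters look like "2021-02-05 02:11:46,320".
    return datetime.strptime(line[:23], LOG_TIME_FORMAT)


def seconds_between(first_line, second_line):
    delta = parse_log_time(second_line) - parse_log_time(first_line)
    return delta.total_seconds()


start = "2021-02-05 02:11:46,320 [1] INFO  - Retrieve data from database"
end = "2021-02-05 02:12:17,008 [1] WARN  - No tables/views found"

# Roughly 30.7 seconds, which is consistent with a 30-second query timeout.
print(seconds_between(start, end))
```

A gap that lands just above a round number such as 30 seconds is a hint, not proof, that the database call was cut off rather than completed.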
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh I also sent you the SearchController updated to use the paragraph id from the HTML as the paragraph number (instead of the paragraph extracted from the text, as that's not reliable).
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Can we have a call? We are available to discuss the SN and Dispute Document Library indexing.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

The Subject Navigator index is generated now, but the Dispute index is not.

  • indexer-config-disputes

The following is the indexing log from our attempt to generate the Dispute index:

2021-02-08 03:41:06,132 [1] INFO  - Execute IndexJob
2021-02-08 03:41:06,432 [1] INFO  - Retrieve data from database
2021-02-08 04:24:44,463 [1] ERROR - Failure in retrieving data from the database
System.Data.OleDb.OleDbException (0x80040E14): Could not allocate space for object 'dbo.WORKFILE GROUP large record overflow storage:  149430086533120' in database 'tempdb' because the 'PRIMARY' filegroup is full. Create disk space by deleting unneeded files, dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup.
   at System.Data.OleDb.OleDbDataReader.ProcessResults(OleDbHResult hr)
   at System.Data.OleDb.OleDbDataReader.NextResult()
   at System.Data.OleDb.OleDbCommand.ExecuteReaderInternal(CommandBehavior behavior, String method)
   at System.Data.OleDb.OleDbCommand.ExecuteReader(CommandBehavior behavior)
   at System.Data.Common.DbDataAdapter.FillInternal(DataSet dataset, DataTable[] datatables, Int32 startRecord, Int32 maxRecords, String srcTable, IDbCommand command, CommandBehavior behavior)
   at System.Data.Common.DbDataAdapter.Fill(DataSet dataSet, Int32 startRecord, Int32 maxRecords, String srcTable, IDbCommand command, CommandBehavior behavior)
   at System.Data.Common.DbDataAdapter.Fill(DataSet dataSet, String srcTable)
   at DBIndexer.SampleDataSource.RetrieveDataFromDB() in p:\contegra\contegra-tologix\DBIndexer\SampleDataSource.cs:line 585
2021-02-08 04:24:44,582 [1] INFO  - Done
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , the indexer doesn't do anything with the DB on its own; it only invokes your view or stored proc. I'm afraid there's nothing I can do to fix the above error. It's on your end (in the database settings or some optimization).
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

The Subject Navigator index has been created, but when we search for any word the web service returns the error shown below.

You can find the Subject Navigator index at the following path:

Server : (10.68.138.10)
Indexing Path : E:\DevContegraISLGRebuildStagingIndexes\SubjectNavigatorIndex


Error : 

Unable to access index  D:\Contegra Indexes\tologix-index-subject\ D:\Contegra Indexes\tologix-index-subject\index_r_1.ix file is truncated.  Committed size=79508586 Actual size=58556416 (file: index_r.ix); No files retrieved in search.

Please look into this and provide a solution.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh that looks like the dtSearch index got corrupted. Did the indexing process complete normally? Are you testing it on the same system where it was indexed? If it was copied to another system, maybe the copy did not complete?
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We copied the index from the server to a local system. Could you please check whether the index was generated properly?

The index was generated at the following path on the server:

Indexing Path : E:\DevContegraISLGRebuildStagingIndexes\SubjectNavigatorIndex
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , the index on the server looks fine to me. In its log file, you can see that the index file was 79508586:

and the current size is as it says later in the log, I guess after you indexed more data:

I guess your copying process failed.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

The Dispute Document view takes a very long time to execute and return data, so we have optimized it using a temp table.

To use a temp table, we need to convert the view into a stored procedure.

If we specify the stored procedure in place of the view name in the following file in the TologixDBIndexer project, will it work?

File Name : indexer-config-dispute-docs.json

We would replace the following:

Current : "IndexTablesViews": [ "VW_DisputeContegraSearch" ],


Replace :  "IndexTablesViews": [ "SelectDisputeContegraSearch" ],

We will use the stored procedure in place of the view. The columns returned by the stored procedure are the same as those of the view.

You can check this on the ISLGRebuildStaging database.

Please let us know.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh instead of the "IndexTablesViews" parameter, you should be using "IndexStoredProcTables", as we did in the case of the disputes indexing, I believe. With that it should work.
Harsh Parikh, Tech Lead at DevIT
OK, thanks Radomir Mladenovic, Contegra Radomir . I hope it will work. We will check and update you by tomorrow if we find any issue.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Could you please try the following search in the Dispute Document Library?

Database : ISLGRebuildStaging on server (10.68.138.11)


Search Keyword: 9REN
Language : Spanish

We are not getting any results for the above search.

Please provide your feedback.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , I indexed the data, reviewed the logs, tested the search and you're right - there are no results for your sample query. However, as far as I can see, the problem is with the data coming from the database and with your indexing config files.

Per your config, the document identifier column for the indexer is "ContentTypeDataMasterID". However, the stored procedure "FE_MetafieldwithValueDynamic" provides IDs which are not unique in the results. For example, ContentTypeDataMasterID 12400 appears 25 times.
Whenever a row with the same ID appears, it overrides the previously indexed row with that ID. In your case, because of all the duplicates, after 89434 rows indexed there are only 9556 in the generated index. Almost 90% of the data was overwritten.

It looks like you have the same problem with the VW_DisputeContegraSearch view.

We had the same discussion around April 17 last year. Please refer to the comments above from around that date. Back then it was said that you would add "RowId" as an identifier column. I see that FE_MetafieldwithValueDynamic has it, but VW_DisputeContegraSearch does not. After adding the column, update your config files ("DocIdColumnName": "RowId") and re-create the Dispute indexes.

Hope this helps.
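The overwrite effect described above can be reproduced with a plain dictionary keyed by the identifier column. This is only a toy model in Python, not the indexer's code: the column names come from this thread, while the rows and the helper name `index_rows` are made up for illustration.

```python
from collections import Counter


def index_rows(rows, id_column):
    """Toy index: one slot per document id, so a repeated id replaces the earlier row."""
    index = {}
    for row in rows:
        index[row[id_column]] = row
    return index


# Made-up rows: two of them share the same ContentTypeDataMasterID.
rows = [
    {"RowId": 1, "ContentTypeDataMasterID": 12400, "Value": "claimant"},
    {"RowId": 2, "ContentTypeDataMasterID": 12400, "Value": "respondent"},
    {"RowId": 3, "ContentTypeDataMasterID": 12401, "Value": "tribunal"},
]

by_master_id = index_rows(rows, "ContentTypeDataMasterID")
by_row_id = index_rows(rows, "RowId")

# Ids that occur more than once would lose rows when used as the document id.
duplicates = {key: n for key, n in
              Counter(r["ContentTypeDataMasterID"] for r in rows).items() if n > 1}

print(len(rows), len(by_master_id), len(by_row_id))  # 3 2 3
print(duplicates)                                    # {12400: 2}
```

Running a duplicate count like this against the view or stored procedure output before indexing shows up front whether the chosen identifier column is safe to use.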
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

The previous view had problems creating the index for the large amount of data, so we are using the stored procedure (SelectDisputeContegraSearch) in place of VW_DisputeContegraSearch.

Can we have a call today between 1:00 PM and 6:00 PM to resolve this issue?
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We updated the Dispute indexing config file (indexer-config-disputes.json) and set "DocIdColumnName": "RowId", but it still does not work.

We need to have a call to resolve this issue, please. We are available between 1:00 PM and 6:00 PM IST.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , please send me the indexing config files you're using now for the disputes and dispute-docs indexes. I'll generate the indexes and review.
I'll be available for a call within the next two hours, but I still need your config files.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

As per our call, I have attached both indexing config files after updating RowId.



Database Name : ISLGRebuildStaging

For the dispute JSON, we are using this SP: FE_MetafieldwithValueDynamic
For the dispute-docs JSON, we are using this SP: SelectDisputeContegraSearch

Both SPs have the unique identifier RowId.

We checked after re-indexing, but it still does not work.

Please let us know your thoughts.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh   how did you manage to re-index so quickly? Yesterday it took me at least an hour to create both indexes. Anyway, I'll get back to you as soon as I generate the indexes and review them.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We have created a new query, SelectDisputeContegraSearch, in place of the view VW_DisputeContegraSearch.

We are using this query, SelectDisputeContegraSearch, for the dispute-doc indexing.

If you index both queries, the indexing should be done within 20-25 minutes.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh   did you run this on "ISLGRebuildProduction" as in the config you sent me? Because for me the query fails:

ERROR - Failure in retrieving data from the database
System.Data.OleDb.OleDbException (0x80040E14): Could not allocate space for object 'dbo.WORKFILE GROUP large record overflow storage:  141537287733248' in database 'tempdb' because the 'PRIMARY' filegroup is full. Create disk space by deleting unneeded files, dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup.
   at System.Data.OleDb.OleDbDataReader.ProcessResults(OleDbHResult hr)
   at System.Data.OleDb.OleDbDataReader.NextResult()
   at System.Data.OleDb.OleDbCommand.ExecuteReaderInternal(CommandBehavior behavior, String method)
   at System.Data.OleDb.OleDbCommand.ExecuteReader(CommandBehavior behavior)
   at System.Data.Common.DbDataAdapter.FillInternal(DataSet dataset, DataTable[] datatables, Int32 startRecord, Int32 maxRecords, String srcTable, IDbCommand command, CommandBehavior behavior)
   at System.Data.Common.DbDataAdapter.Fill(DataSet dataSet, Int32 startRecord, Int32 maxRecords, String srcTable, IDbCommand command, CommandBehavior behavior)
   at System.Data.Common.DbDataAdapter.Fill(DataSet dataSet, String srcTable)
   at DBIndexer.SampleDataSource.RetrieveDataFromDB() in p:\contegra\contegra-tologix\DBIndexer\SampleDataSource.cs:line 585
INFO - Done
Radomir Mladenovic, Contegra
and indexer for dispute-docs fails as well:

DEBUG - Getting properties for db row 0 in table SelectDisputeContegraSearch
ERROR - Object reference not set to an instance of an object.
System.NullReferenceException: Object reference not set to an instance of an object.
   at DBIndexer.SampleDataSource.GetNextDoc() in p:\contegra\contegra-tologix\DBIndexer\SampleDataSource.cs:line 156
INFO - Done

I'll try indexing staging DB.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We are able to generate the indexes. What should we do now?
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh as I asked above, are you indexing staging or production database?
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We created the indexes in our local environment, but we used the same database, ISLGRebuildStaging.

Please try to create the indexes using the ISLGRebuildStaging database.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , I indexed Disputes using the stage database.
A search for "9REN" returns 11 results. However, when you add a language filter it doesn't find anything. The problem appears to be "SelectDisputeContegraSearch", which doesn't provide the language: it's always empty, which is why nothing is found. (The Language that you see in the dispute results comes from the dispute-docs index, but the field has to be in the disputes index as well, as that's where we apply the filter.)
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

The above indexing result contains Dispute data only; we also need the Dispute Document data.

The language comes from the Dispute Document data. One Dispute entry is associated with multiple Dispute Documents.

We have changed the query logic of FE_MetafieldwithValueDynamic so that each value now has its own row, because using PIVOT in the query takes too long with this amount of data.

If you look up RowId (211742, 211743, 211744), the language data is available. So when a user searches for 9REN with the language Spanish, we need all of that data.

For the Dispute index, the RowIds of the documents are (51167, 51168, 51169, 51170, 51171, 51172, 51173, 51174, 51175, 51176, 51177, 51178, 51179, 51180, 51181, 51182, 51183, 51184, 51185, 51186, 51187, 51188, 51189),

and for the Dispute Document index, the RowIds of the documents are (211742, 211743, 211744).

We need all Dispute and Dispute Document data when the user performs a combined search.

Please suggest.

We are available for a call to discuss.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh ,

On September 9 you wrote: "As discussed, You will get one column DisputeId from 1st SQL View and you need to pass DisputeId column value to 2nd SQL View and provide us JSON format result to bind data as per our model."

That's how it works:
  1. The keyword search and filters are run against the "disputes" index. (For indexing we're using FE_MetafieldwithValueDynamic, correct?)
  2. All different DisputeId's are collected.
  3. In the "dispute docs" index, which was created using "SelectDisputeContegraSearch" (correct?), we find all documents where ContentTypeDataMasterId matches one of the collected DisputeId's. - BTW, as a reminder, your team made this change back in September!
  4. Matches from the "dispute docs" are returned as a result.
Row 211743 that you mention above is in the dispute docs index, but I don't see any language fields in the disputes data. I put the indexing logs in the 2021-02-23 folder on OneDrive. Please check indexing-disputes-stage.log and tell me which data row corresponds to row 211743 in the dispute docs.
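The four steps above can be sketched as a two-stage lookup. The Python below is only a simplified model of that flow, not the actual SearchController: the field names "DisputeId", "ContentTypeDataMasterId" and "Language" come from this thread, while the in-memory lists, the "Text" field, the sample values and the function names are hypothetical. It also shows why a filter on a field that is empty in the "disputes" index returns nothing, even when the value exists in the "dispute docs" index.

```python
def search_disputes(disputes_index, dispute_docs_index, matches_query):
    """Stage 1 collects DisputeIds; stage 2 returns the dispute docs that reference them."""
    # Steps 1-2: run keyword and filters against "disputes", collect distinct ids.
    dispute_ids = {rec["DisputeId"] for rec in disputes_index if matches_query(rec)}
    # Steps 3-4: return every dispute doc whose ContentTypeDataMasterId was collected.
    return [doc for doc in dispute_docs_index
            if doc["ContentTypeDataMasterId"] in dispute_ids]


# Made-up sample data: the language is present only on the dispute doc.
disputes = [{"DisputeId": 10, "Text": "9REN v. Spain", "Language": ""}]
dispute_docs = [
    {"RowId": 1, "ContentTypeDataMasterId": 10, "Language": "Spanish"},
    {"RowId": 2, "ContentTypeDataMasterId": 10, "Language": "English"},
]


def keyword_only(rec):
    return "9ren" in rec["Text"].lower()


def keyword_and_language(rec):
    return keyword_only(rec) and rec["Language"] == "Spanish"


print(len(search_disputes(disputes, dispute_docs, keyword_only)))          # 2
print(len(search_disputes(disputes, dispute_docs, keyword_and_language)))  # 0
```

In this model, every dispute doc that shares a collected id is returned, so the number of rows coming back depends on the document id used when the "dispute docs" index was built.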
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Thanks for the update. We have restored the FE_MetafieldwithValueDynamic query to its previous form, and we fetch the data by ContenttypedataMasterId. No change to the config file is needed now.

However, for the 2nd query, SelectDisputeContegraSearch, we need all results for each ContenttypedataMasterId.

For example, if the Claimant column in the second query has data in multiple rows, we need all of those rows in the model.

Currently, your API returns only one row.

Please let us know so that we can have a call to discuss.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , there are two things here:
  1. Which field in the 1st index ("disputes") identifies records you need in the results? (currently it's "DisputeId" - is that the one you need?)
  2. Which field in the 2nd index ("dispute docs") should be matched with the above collected values? (currently it's "ContentTypeDataMasterId")
Can you answer these?

If #1 needs to be changed:
  • The property "FacetedFields" in the indexer config file needs to be updated as it's set to "DisputeId" currently.
  • Method SearchDisputes in the SearchController needs to be updated to use the proper field.
If #2 needs to be changed:
  • Method SearchDisputes in the SearchController needs to be updated to use the proper field.
Sorry, I'm not available for a Skype call today. In any case, I hope it's not needed as I believe you have all information you need - you have been modifying the SearchController before.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We resolved the Dispute Document issue. We added RowId to SelectDisputeContegra (2nd query) and changed the config file for disp-doc to set "DocIdColumnName": "RowId", and it seems to work now.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

The other issue we found is that when we search for the keyword "Tribunal" in the Full Text Search module, your API takes 8 to 10 seconds just to respond. The total number of records found is about 6000.

Could you please improve that and provide an updated SearchController?

You can use the database ISLGRebuildStaging.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , how many results are you pulling when you search for "Tribunal"? Can you send me your results JSON for this search?
(To test it on my own I'd have to create the full index first.)
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We display about 6000 records in total on the FTS page after converting the JSON into the model. The following is the JSON for this search.



Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh which database and documents folder should I use to generate the FTS index? If you have a copy of the index on 10.68.138.10 then just let me know where so I could copy it.

Considering the huge results response, the search time is not that surprising, especially because we don't limit results in order to have proper sorting by field. Do you use the option to sort search results? By which field do you sort? If you can, please send me the complete search request json.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

You can use the following database to generate the index.

Database Name : ISLGRebuildStaging (10.68.138.11)


The document path on the server (10.68.138.10) is as follows:

Document Path :  E:\ISLGRebuildStaging\wwwroot\Documents

By default, we sort by Relevance. The criterion is:

  • Relevance (default) - relevancy criteria based on current functionality of "hit count".

The following is the JSON file we pass when we search for the word "Tribunal".

Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , thanks, I'll generate index and download for local testing. It may take a while so I will probably not have an answer by the end of your work day today.
Harsh Parikh, Tech Lead at DevIT
OK Radomir Mladenovic, Contegra Radomir . Please provide your feedback by tomorrow so we can resolve this.
Darsh Shah, DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

I am the QC person on the DevIT side. I have raised one issue for FTS search:

Steps to Reproduce:

1) Go to the FTS module
2) Search for the text "Tribunal AND Absence" with the "Boolean" search type selected
3) Click the Search button

Actual Result: The system displays files that contain either the word Tribunal or the word Absence.
Expected Result: The system should display only files that contain both words, Tribunal and Absence.
Darsh Shah, DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

As per your comment on the issue "21138 - Search > Keyword not highlighted", we deleted the noise word and re-indexed.
After the re-indexing completed, when the user searches for the noise keyword "Like", the search works properly but the word "Like" is not highlighted in the results.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh the performance issue comes from the field highlighting. dtSearch does not support fields-only highlighting, so the way it's implemented now is that we highlight the whole cached document and then extract the fields part. The problem is that the documents are big (I see 1-2MB each), so a lot of time is wasted on something we throw away. I have an idea for how to approach this, but I need several hours, so it will not be ready today. I should have an update by Monday at the latest.

@Darsh 
1) Search for "tribunal" gives 6059 results, search for "tribunal AND absence" gives 2137 results. By the numbers, it looks fine. Can you give me more details about the context where you see this issue?
2) That still sounds like the index was generated with a stopwords list. How many results do you get when searching for "like"? If it gives back all documents and all paragraphs, it's still there as a stopword.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

OK. Please provide the update on Monday.

Regarding Darsh's issue:

He says that when we search for "tribunal AND absence", 2137 results are found, and when the user then clicks on any paragraph, only the word "tribunal" is highlighted. Our expectation is that only those paragraphs where both words, "tribunal" and "absence", are present should be found.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh , for the Darsh issue, on Jan 22 you said for issue #1136 "A paragraph link must exist for each paragraph in the document where at least one of the  keywords from the search was found"
Isn't that in conflict with this issue where you want only those paragraphs where both are found?
Harsh Parikh, Tech Lead at DevIT
Radomir Mladenovic, Contegra Radomir , but if we apply the dtSearch rules and the user searches with the "And" keyword (e.g. tribunal and Greek) in Boolean mode, then our expectation is that only those documents where both words match in the HTML or PDF are displayed.

I am not 100% sure about the dtSearch rules. Morgan Maguire, CEO Morgan , please advise.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh two separate (and a bit different) searches happen for FTS and paragraph highlighting.
  1. Using AND implies you expect documents containing both (or all) keywords. That's fine, that's how FTS currently works - you get documents containing both keywords in the complete document.
  2. When showing paragraphs within the document, we show all paragraphs containing any keyword. Initially this was showing only paragraphs containing both keywords but after your feedback for issue #1136 this was modified to show "any".
Please let us know what the desired behavior is on your end, because the expected result for #1136 and the issue reported by Darsh are in conflict.

If you prefer only paragraphs containing all keywords, then we have to modify how the search works: the main search would have to run against the paragraphs index, followed by getting the documents where matching paragraphs are found, not vice versa.
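The difference between the two behaviors can be shown with a toy example. The sketch below is plain Python over made-up text, not dtSearch: `document_matches_all` models the document-level AND used by FTS, and `matching_paragraphs` models the paragraph step in either "any" mode (the behavior after #1136) or "all" mode (the behavior Darsh expects). All names and sample sentences are hypothetical.

```python
import re


def words(text):
    """Lower-cased set of word tokens in a piece of text."""
    return set(re.findall(r"\w+", text.lower()))


def document_matches_all(paragraphs, keywords):
    """Document-level AND: every keyword occurs somewhere in the document."""
    doc_words = set().union(*(words(p) for p in paragraphs))
    return all(k.lower() in doc_words for k in keywords)


def matching_paragraphs(paragraphs, keywords, mode="any"):
    """Indexes of paragraphs containing any (or all) of the keywords."""
    combine = any if mode == "any" else all
    wanted = [k.lower() for k in keywords]
    return [i for i, p in enumerate(paragraphs)
            if combine(k in words(p) for k in wanted)]


paragraphs = [
    "The tribunal met in March.",
    "In the absence of the respondent, the hearing continued.",
    "Costs were reserved.",
]
keywords = ["tribunal", "absence"]

print(document_matches_all(paragraphs, keywords))        # True
print(matching_paragraphs(paragraphs, keywords, "any"))  # [0, 1]
print(matching_paragraphs(paragraphs, keywords, "all"))  # []
```

In this example the document is a valid AND hit, yet in "all" mode it has no paragraph to link to, which is why paragraph-level AND would require searching the paragraphs index first.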
Morgan Maguire, CEO
Hello Radomir Mladenovic, Contegra Radomir , Harsh Parikh, Tech Lead at DevIT Harsh and Darsh Shah, DevIT Darsh ,

Here is a video explaining how the highlights should work depending on whether the user is performing a search with "All Words" or "Boolean" when "and" is included as a search term. If there is any further confusion on these requirements, please refer to how the search results are produced in the legacy application:



Thanks,

Morgan
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh @Darsh, I think the highlight of paragraphs currently behaves as in Morgan's video - any keyword of an AND boolean expression is highlighted in a paragraph (in the video that's a page), not only in paragraphs that have both keywords. If I'm missing something, please let me know.
Radomir Mladenovic, Contegra
As for highlighting of "like", I tested it and it works. (I generated the index for stage using an empty stopwords list.)
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We removed the noise word from noise.dat in the TologixDBIndexer project and then re-indexed.

Is there any other noise list from which we need to remove it?
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , the indexer is using noise.dat file from the folder where the indexer exe is. That's all.
After changing the stopwords list you have to delete the old index and re-index because the noise words are copied to the index folder when index is created. In an existing index you can see used stopwords in index_n.ix file.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We did the same thing. The latest indexes for all modules are at the following path on the server. Could you please check whether they are OK?

Server : 10.68.138.10

Indexing Path :   E:\DevContegraISLGRebuildStagingIndexes
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , FTS index looks OK.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , you can find updated indexer and search in folder 2021-02-27 on OneDrive - https://1drv.ms/u/s!AugzRBG6eTFwg4sJwygyvkWxOQnMWg?e=gfh2JQ

To work around the performance issue I had to create an additional index as part of FTS indexing. On my system, the search for "tribunal" is about 10x faster now.

To specify the path of this new index, use the property "FieldsIndexDir" in the indexer config file for FTS. I sent you a sample config file as well.

The SearchController has been updated as well. You need to add "FullTextFieldsIndex", with the path to the new index, to application.conf.

Let me know how this worked for you.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We are making the changes as per your instructions above. However, after copying the SearchController file into the Web Search project, we get the following error.


Please suggest.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

The indexer is also not creating any indexes and gives the following error after we copied the TologixDBIndexer.exe file.

System.InvalidOperationException: Nullable object must have a value.
   at System.ThrowHelper.ThrowInvalidOperationException(ExceptionResource resource)
   at System.Nullable`1.get_Value()
   at DBIndexer.CmdLineIndexer.runMainIndexer(DocFieldsDataSource fds, ParagraphDataSource pds) in p:\contegra\contegra-tologix\DBIndexer\CmdLineIndexer.cs:line 199
   at DBIndexer.CmdLineIndexer.run() in p:\contegra\contegra-tologix\DBIndexer\CmdLineIndexer.cs:line 123
   at DBIndexer.MainForm.Main(String[] args) in p:\contegra\contegra-tologix\DBIndexer\MainForm.cs:line 181
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , sorry, I thought it was obvious from the context that this field is just one more string. I'm attaching the full file.
I'm checking indexer and will be back to you in a minute.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh   you can get updated indexer from the 2021-03-01 folder on OneDrive.
Harsh Parikh, Tech Lead at DevIT
OK Thanks Radomir Mladenovic, Contegra Radomir . Will check and let you know if we face any issue.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh , the path for "FieldsIndexDir" is not properly encoded in your config file. Each backslash must be written as two: "\\". See how it's done for the other properties:
Harsh Parikh, Tech Lead at DevIT
Thanks Radomir Mladenovic, Contegra Radomir . I missed that.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

FTS indexing is stuck after RowId 2. It has been stuck for the last 15 minutes. I have attached the log file.

Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , that's strange. I generated the index for stage on Saturday without any issues. (You can even find my indexes somewhere under C:\temp, I believe; I cannot log in to the VPN at the moment.)
To troubleshoot, try enabling the DEBUG logging level, restart indexing, and see what's logged.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

I started indexing again, but it is still stuck after RowId 2. How do I enable debug mode?

We are currently trying to index in our local environment, but the database is the same as ISLGRebuildStaging.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

After updating the PDFHighlighter URL, FTS indexing is still stuck after RowId 2. Please suggest how we can resolve this issue.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , did you try enabling the DEBUG logging level? In the TologixDBIndexer.exereplace INFO with DEBUG:

    <root>
      <level value="DEBUG" />
      <appender-ref ref="ConsoleAppender" />
      <appender-ref ref="FileAppender" />
    </root>
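
For anyone who prefers to script the change, here is a minimal Python sketch that flips the level in a fragment like the one above. It assumes the fragment is well-formed XML and that the `<level>` element sits directly under `<root>`; the function name `set_root_level` is made up for illustration and is not part of the indexer.

```python
import xml.etree.ElementTree as ET

# The <root> fragment from the message above, with the level still at INFO.
FRAGMENT = """<root>
  <level value="INFO" />
  <appender-ref ref="ConsoleAppender" />
  <appender-ref ref="FileAppender" />
</root>"""


def set_root_level(xml_text: str, level: str) -> str:
    """Return xml_text with the value of the <level> element under <root> changed."""
    root = ET.fromstring(xml_text)
    node = root.find("level")
    if node is None:
        raise ValueError("no <level> element found under <root>")
    node.set("value", level)
    return ET.tostring(root, encoding="unicode")


updated = set_root_level(FRAGMENT, "DEBUG")
print(updated)
```

In a real config file the `<root>` element would be nested inside a larger document, so the lookup would need to search for it rather than parse it as the top-level element.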

BTW, is there a problem with the VPN currently? I wanted to try re-indexing on the 10.68.138.10 server but I cannot connect although I'm on the VPN. Any idea?
Harsh Parikh, Tech Lead at DevIT
No Radomir Mladenovic, Contegra Radomir . I am able to connect to 10.68.138.10 over the VPN.

Also, I am not able to find TologixDBIndexer.exereplace in TologixDBIndexer Project. Could you please guide us how to enable debug mode ?
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Can we connect quickly on Skype to resolve this FTS indexing issue in our local environment ?
Radomir Mladenovic, Contegra
The VPN issue was something on my end. It works after restarting the system.

I'm trying to run indexing again - currently waiting for your proc to give results and start indexing. I'll get back to you soon.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh for me indexing works on the server:
2021-03-02 03:02:54,095 [1] DEBUG - Indexing file E:\ISLGRebuildStaging\wwwroot\Documents\HTMLFiles\ARB-0011 - ICSID Institution Rules (1984).html
2021-03-02 03:02:54,235 [1] DEBUG - Complete DocFields: rowid	3	contenttypedatamasterid	10697	documentcontenttypeid	13	documentid	3	documentcontenttypename	Arbitration Rules	pdffilename	ARB-0011 - ICSID Institution Rules (1984).pdf	htmlfilename	ARB-0011 - ICSID Institution Rules (1984).html	fullcitation	ICSID Institution Rules (1984)	fullcitationtext	ICSID Institution Rules (1984)	language	English	validfrom	26 September 1984	validto	31 December 2002	validtopresent	false	disputeid	0	proceedingstageid	0	proceedingstageorder	0	issubsequentdevelopments	false	refjuriscount	0	issuingorganization	International Centre for Settlement of Investment Disputes	ispdfonly	false	isuploadskipped	false	treatytypeid	0	sortingdate	19840926	field_22	ICSID Institution Rules (1984)	field_23	ARB/0011	field_24	19840926	field_25	20021231	field_26	false	field_27	368	field_32	29	internal_file_id	0278A68311C95612114AB0EC4E1CA84D	internal_rec_id	BE326C0A9485CE198199B791AC196736
2021-03-02 03:02:54,235 [1] DEBUG - Getting properties for db row 3 in table FE_MetafieldwithValueDynamicFTS
2021-03-02 03:02:54,235 [1] INFO  - Index doc db://FE_MetafieldwithValueDynamicFTS#RowId=4
2021-03-02 03:02:54,235 [1] DEBUG - Handling column: RowId = 4
2021-03-02 03:02:54,235 [1] DEBUG - Handling column: ContentTypeDataMasterId = 10698

Please send me your config file. I'd like to check it and try it.

I don't think a Skype call would be productive right now, considering we have to wait almost 30 minutes for the stored procedure to return results. Can you reproduce the issue when indexing the dev database? Or maybe you can isolate the issue by checking what row #2 is and making a proc that returns a data set with only that row?
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

I have attached my FTS config file below.



I am going to pull the ISLGRebuildStaging database from the server (10.68.138.11) to my local system and try re-indexing again. If we still get the same issue, I will let you know.
Radomir Mladenovic, Contegra
I cannot connect to your database, so I'm not able to reproduce this.
Let me know how indexing stage worked for you.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

The indexing issue is still there. We downloaded the whole ISLGRebuildStaging database from the server and hosted it in our local environment, but indexing still gets stuck after row 2.

Is there anything we missed in the TologixDBIndexer project?

Can we have a call to sort out this issue? It is critical for us to check the FTS module with indexing.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , I have another call at 12:00 my time. I'll ping you now on Skype and let's see what we can do by then...
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

I have copied your DBIndexer3 project from the server and replaced my indexer config, but now we are getting the following error.

Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

The old indexer is working fine for FTS. There might be something wrong with the new indexer exe that you provided on 1st March.

Please look into this and let us know how we can resolve this issue so that we can generate indexes for the FTS module.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , there's nothing special in the new indexer except that it starts one more processing thread for the third index. Could it be some limitation of the system you're using? Can you try indexing on the server, since it worked fine for me there?
I'll add more logging to the indexer so that we can figure out exactly where it stops for you.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh I prepared an indexer update with additional logging. Please take the update from the 2021-03-02 folder on OneDrive, run it and send me the log. (You can stop it as soon as you notice that it has hung.)
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We have updated the TologixDBIndexer.exe file and tried to generate the FTS indexes, and the following log was produced.

Please confirm whether it is OK or not.

Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh you see this error in the log?

I believe this is why it doesn't work for you. Check the path for the fields index in your config file - I guess you didn't put double back-slash in the folder path.
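
To illustrate the escaping rule, here is a minimal Python sketch. It assumes the config file is JSON (the thread does not confirm the indexer config format), and the folder path is a made-up placeholder.

```python
import json

# Hypothetical Windows folder for the fields index (not taken from the thread).
fields_index_dir = "C:\\indexes\\fts-fields"

# json.dumps writes every backslash as "\\" in the config text.
config_text = json.dumps({"FieldsIndexDir": fields_index_dir}, indent=2)
print(config_text)

# Reading the text back yields the original single-backslash path.
round_tripped = json.loads(config_text)["FieldsIndexDir"]

# The same path typed with single backslashes is not valid JSON:
# "\i" is not a recognised escape sequence, so parsing fails.
bad_text = '{"FieldsIndexDir": "C:\\indexes\\fts-fields"}'
try:
    json.loads(bad_text)
    bad_is_valid = True
except json.JSONDecodeError:
    bad_is_valid = False
print("single-backslash config parses:", bad_is_valid)
```

Other parsers may behave differently from Python's `json` module: some reject the file, others silently reinterpret sequences such as `\f` or `\t`, which would point the indexer at a wrong folder.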
Harsh Parikh, Tech Lead at DevIT
Yes, true Radomir Mladenovic, Contegra Radomir . We have updated the path to use double backslashes and are now re-indexing again.

We will let you know if we still face an issue.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

The FTS indexes have now been created, but we haven't seen any performance improvement. When we search for the word "Tribunal", it still takes a long time to get the response from the API.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh , did you copy all 3 indexes? Did you set up the new fields index in the web app config?
Harsh Parikh, Tech Lead at DevIT
Yes Radomir Mladenovic, Contegra Radomir . We have created new indexes, which were created in all 3 FTS folders. We also set up the new fields index in the web app config. I have attached my appsettings.json file here.

Radomir Mladenovic, Contegra
 That doesn't make sense. Did you copy the SearchController changes?
Harsh Parikh, Tech Lead at DevIT
Yes Radomir Mladenovic, Contegra Radomir . I have copied the SearchController as well.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh , run the server in debug mode, put a breakpoint on line 151 (where the method returns), run a search, and when it stops at the breakpoint, send me the values of the variables time1, time2, ... time5.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Following are values. 

time 1 : 5308
time 2 : 1895
time 3 : 6157
time 4 : 10232
time 5 : 3566
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

I hosted the published code in a different IIS and it seems fine. Let us know once you have completed the FTS changes as per our discussion on Monday (1st March).
Morgan Maguire, CEO
Hi everyone,

Further to our discussions on the searches in Full Text Search and Subject Navigator during my meeting with DevIT and Industrial earlier today and to give Rob Wiesenberg, Contegra Rob and Radomir Mladenovic, Contegra Radomir further context to the problems we're experiencing, I created the video below outlining the issues I've discovered within these searches:
  1. Fuzzy typo is working as expected (not an issue) in both the Subject Navigator and Full Text Search.
    • Naomi Joanis, UX Team Lead at Industrial Naomi , note that the fuzzy typo function is proportionate to the number of characters searched. Therefore, a term like "abose" may have too few characters to allow a fuzzy typo setting of "1" to compensate and produce results for "abuse".
  2. The speed of both searches is slower than expected (and there should be a loading indicator while the searches are loaded).
  3. The Subject Navigator search results are incomplete.
  4. Boolean operators are not working within the Full Text Search.
  5. There is no way to access PDF results within the Full Text Search.
  6. Harsh Parikh, Tech Lead at DevIT Harsh has indicated that the indexes are extremely large (30GB+).
I'm going to have a call with Rob Wiesenberg, Contegra Rob this afternoon to discuss these issues, and if required we'll set up a call tomorrow at 8:15am Vancouver time to discuss them as a group.

Radomir Mladenovic, Contegra
2. What are the specs of the system running the staging application?

3. I can investigate this. Harsh Parikh, Tech Lead at DevIT Harsh please send me configuration file (or parameters needed) to index Subject Navigator data in staging.

4. I know why - after it was said to get paragraphs containing any keyword in the query, I changed the search type of the paragraph query to "any keyword". (The main search for the documents still runs as a boolean search.) This can be fixed but, if we need to change the search to match all keywords in the paragraph, then the complete FTS search needs to be reworked, so there's no point in fixing it.

5. I will investigate this.

6. It's a tricky one. Indexes are huge because in the created index we cache the complete document text and the original file. This is needed by dtSearch to do highlighting faster, although I think it's really needed only for PDF highlighting. 
It's possible to make this PDF highlighting work without storing the original file content in index, but at the cost of the highlighting performance.
Another issue with removing the original files from the index is that we'd have to create a separate index only for PDFs, using the full document path. This is because the current organization of data allows the same file to appear in multiple indexed documents - I hit this issue at the beginning of the FTS implementation and had to make changes to accommodate the data in the database.
In short, we can bring the index size down by creating one more index.
Morgan Maguire, CEO
Hi everyone,

I just had a call with Rob Wiesenberg, Contegra Rob . He is going to connect with Radomir Mladenovic, Contegra Radomir tomorrow to work through solutions to the issues above, come back with a plan to tackle each issue and connect with Harsh Parikh, Tech Lead at DevIT Harsh and the team as required.

In the interim, Harsh Parikh, Tech Lead at DevIT Harsh could you please provide Radomir Mladenovic, Contegra Radomir with the configuration file for the Subject Navigator index so that Radomir Mladenovic, Contegra Radomir can examine the issues with missing results in the search.

Thanks,

Morgan
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh I don't need the config for the Subject Navigator any more. I generated the index using the same stage db and I'm getting the same number of total results for the "good faith" search.
Now, to troubleshoot this issue, I need from you:
  • The payload of the search request you're sending.
  • At least one row number of the stored procedure data in which requested data appears but you're not getting it in the search results. For example, a row for one of those items that should appear in the A or B section mentioned by Morgan.
Thanks,
Radomir
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh please also provide a sample of a search that returns a PDF without pages. Using the FTS index I generated for stage, I ran the following search for "tribunal", limited to ispdfonly=true:

{
    "searchRequest": "tribunal",
    "FilterStatement": {
        "type": "boolean",
        "Operator": "and",
        "clauses": [
            {
                "type": "match",
                "field": "ispdfonly",
                "values": [
                    "true"
                ]
            }
        ]
    },
    "SearchType": "3",
    "Stemming": false,
    "Synonyms": false,
    "Fuzzy": false,
    "Fuzziness": "1",
    "SortField": "hits",
    "SortOrder": "desc",
    "PageNum": 0,
    "PageSize": "20"
}
and all results have paragraphs (pages in the case of PDF) returned - you can see this by the line numbers:

Did you have PDF documents on your system when you generated the index? Any errors in your FTS indexing log?
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Have you updated the indexer for FTS?
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh no, I used the same indexer that I sent to you.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We checked with tribunal and nationality words and it works fine. We get only those documents which have pinpoint references.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We are looking into the Subject Navigator SQL View and will check again (2nd Point).

Let us know once you have completed the rest of the points.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , I'm not quite sure what your message says about what you have checked so far. From my point of view, only #4 needs to be addressed, and I'll look into #6.
If you see other issues, please provide details as I asked you above so that I can cross-reference data and index created.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Can we take call for Subject Navigator result ? 
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh I'm not available for a call today. Please provide me with the data I asked for and I'll take a look at it.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

What do you need from my side? If I search for "good faith", the result is not generated properly.

Please let us know what I need to provide you.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh   I'm confident that the indexer is working correctly for data it got. If you don't see the results expected, please show me that the missing data is returned by your stored procedure in the first place. I cannot troubleshoot this if I don't know what we're looking for.

So, as asked above, I need from you:
  • The payload of the search request you're sending.
  • At least one row number of the stored procedure data in which requested data appears but you're not getting it in the search results. For example, a row for one of those items that should appear in the A or B section mentioned by Morgan.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Please see the following screenshot. If we search for "good faith", we need the following row.

You can find this data in the ISLGRebuildStaging database. We have just updated the SQL View.

Id : 24338
BranchID : 17559
ParentId : 5579
Branch Name : See "Good faith"
HierarchicleParentIds : 1,5579,17559,4202,4661,5207,5579,5580,5583,5585,5597,5606,5657,5661,5686,5688,5693,5695,5708,5724,7523,7826,8002,8142,8258,8594,9588,9684,10431,11241,11258,11685,12649,12959,13084,13181,13666,13693,14749,15532,15562,16154,17381,17559,19044,19092,20332

SelectedNodes : 1,5579,17559
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

I also noticed that there is no "good faith" branch under B, so why does your JSON response include the B branch?

There is no detail associated with "good faith" under the B branch.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh I don't know which info points to the "B" section; it's your data and I don't know how to interpret it. I really need more info to track this down. Make sure the data is in the indexed data set (e.g. row number), and send me your request payload and what you expected to get back but did not receive. Which field should I be looking at?
I just indexed SubjectNav after you updated the proc and will check the previous thing you sent me.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

It is a parent-child branch structure in the Subject Navigator. "Good faith" appears under branch A > Abuse of process > "see good faith", but your response is not providing the A branch nodes.

It is very complicated to provide you with the results of all branches, but as a first step we need these nodes in the response.

Can we have a short call so I can provide you with the details?

Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh I have some time atm, I'll give you a call shortly.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , are you using pagination with the Subject Navigator search? If you don't pull all results then it's normal that you don't get all nodes you're expecting. When I get all results, I see node "1" in the SelectedNodes.
However, I see an issue with this search because nodes in the SelectedNodes are not unique and values repeat a lot. I'll change that.
Another issue I see with this search is that, when you pull all results, the search is slow because all results are being highlighted. I think we should also use progressive highlighting for this. I'll add some options for this.
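
As an illustration of the de-duplication mentioned above, here is a minimal Python sketch that drops repeated IDs from a comma-separated node list while keeping first-seen order. It is only a sketch on the consuming side, not the change made in the search service, and the function name is hypothetical.

```python
def unique_node_ids(raw: str) -> list[int]:
    """Parse a comma-separated ID string and drop repeats, keeping first-seen order."""
    ids = (int(part) for part in raw.split(",") if part.strip())
    # dict.fromkeys keeps insertion order, so the first occurrence of each ID wins.
    return list(dict.fromkeys(ids))


# Start of the HierarchicleParentIds value quoted earlier in the thread.
sample = "1,5579,17559,4202,4661,5207,5579,5580"
print(unique_node_ids(sample))  # [1, 5579, 17559, 4202, 4661, 5207, 5580]
```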
Harsh Parikh, Tech Lead at DevIT
Ok Radomir Mladenovic, Contegra Radomir . Please provide the updated solution. We will be available tomorrow between 11:00 AM and 3:00 PM IST, so we will check and let you know.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We are not using paging in the SN module. We need to get all data.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , I'm sending you the search service updates for the Subjects Navigator search. The files are in the 2021-03-06 folder on OneDrive.

1) My understanding is that for the Subjects Navigator you need all results immediately. I believe you were not getting some nodes because you were taking the first page only and the nodes are not in order.
The "subject-nav" search method has been modified to return all results, but without highlighted fields. That should allow you to build the complete page with all the content.

One more new thing is that in the response you can find "timeLog" array with a log of time spent in different stages. For example, executing Subjects Navigator search for "good faith" on my system takes 620ms:


I would suggest logging "timeLog" to the browser console. Then you can easily review this when testing.

2) Next, to get highlighted fields for the Subjects Navigator search, you need to use "highlight-subject-nav" service method. The request payload is the same as for the "subject-nav", with addition of "FieldFilterName" and "FieldFilterValues" fields that limit results.
For example, after adding the results (from "subject-nav") to the page, you could collect the "id" field values of all results visible in the viewport, and possibly those for one more page, and send a request to "highlight-subject-nav":

{
    "searchRequest": "good faith",
    "SearchType": "AllWords",
    "Stemming": false,
    "Synonyms": false,
    "Fuzzy": false,
    "Fuzziness": "2",
    "FieldFilterName": "id",
    "FieldFilterValues": [18379, 5602, 3082]
}
That would return you results only for those 3 nodes (where id is 18379, 5602 or 3082), with "highlightedFields" included. Take the highlighted fields and update the page.
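
The batching idea can be sketched in Python as follows. The payload fields are copied from the example above. The function name, the `batch_size` parameter and its default are assumptions for illustration. No HTTP call is made, since the service URL is not given here.

```python
from typing import Iterator


def highlight_payloads(query: str, node_ids: list[int], batch_size: int = 50) -> Iterator[dict]:
    """Yield one "highlight-subject-nav" request body per batch of node ids."""
    for start in range(0, len(node_ids), batch_size):
        yield {
            "searchRequest": query,
            "SearchType": "AllWords",
            "Stemming": False,
            "Synonyms": False,
            "Fuzzy": False,
            "Fuzziness": "2",
            "FieldFilterName": "id",
            "FieldFilterValues": node_ids[start:start + batch_size],
        }


# ids of the nodes currently visible on the page (values from the example above)
visible_ids = [18379, 5602, 3082]
batches = list(highlight_payloads("good faith", visible_ids, batch_size=2))
print(len(batches))  # 2
```

Each yielded dictionary would be serialized to JSON and posted separately, so the page can update the visible nodes first and request further batches as the user scrolls.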
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Following bugs are found by Industrial team in Subject navigator module. Could you please check and provide update :

Bug no : 22141 [staging] any words > search doesn't disregard words such as "and" or "or"

Steps to reproduce:
  • Set search type to "any words"
  • type a search like "abuse and ego"


Result: The results highlight the word "and" but not "abuse" or "ego"

Expected: Words such as "and" or "or" will be ignored in the "any words" search type, and "abuse" or "ego" would be found as matches


Bug no : 22160 [Staging] Stemming > Not working

Steps to reproduce:
  • Search options > check the box for stemming
  • Type in the word "like"
  • view results
  • Reset search
  • Type in the word "likelihood"
  • View results


Result: The results for "likelihood" don't appear when searching for "like"

Expected: If stemming is selected, I should see results for "likelihood" when searching for "like"


bug No. 22162 :
[Staging] Synonyms > Not working

Steps to reproduce:
  • Type in "bias" in search field
  • Check the box for "synonym"
  • Enter search
  • View results
  • Search "discrimination"
  • View results


Result: No synonyms appear for that word, only "bias" is shown

Expected: Will see highlighted results for synonyms


Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh regarding the missing "A" node, it happens because we're hitting dtSearch query size limit (70k characters) pulling hierarchy IDs. I'll refactor this and provide you with an update later today.
However, related to the change we made in the code during our Skype call, I'd suggest you review on your side whether you need nodes from both "SelectedNodes" and "hierarchicalparentids". It looks like the latter pulls in many more nodes.
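
One generic way to stay under a query size limit such as the 70k characters mentioned above is to split the ID list across several smaller queries. The Python sketch below shows only the splitting step. The `" or "` separator and the function name are assumptions, and this is not necessarily how the service was refactored.

```python
def chunk_ids(ids: list[int], max_chars: int, sep: str = " or ") -> list[str]:
    """Join ids with sep into as few strings as possible, each at most max_chars long."""
    chunks: list[str] = []
    current = ""
    for node_id in ids:
        token = str(node_id)
        if len(token) > max_chars:
            raise ValueError(f"id {token} alone exceeds max_chars={max_chars}")
        candidate = token if not current else current + sep + token
        if len(candidate) > max_chars:
            # Adding this id would overflow the budget: close the current chunk.
            chunks.append(current)
            current = token
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


# Tiny budget so the split is visible; a real call would pass the actual limit.
print(chunk_ids([1, 5579, 17559, 4202], max_chars=15))
# ['1 or 5579', '17559 or 4202']
```

The results of the individual queries would then be merged, with duplicates removed, before building the response.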
Harsh Parikh, Tech Lead at DevIT
Yes Radomir Mladenovic, Contegra Radomir . We checked and we need all hierarchicalparentids, because within the search we also render the other branches for the user to navigate. So we need the results for all hierarchicalparentids branches.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh

Bug no : 22141 - That's how dtSearch works. You put a keyword into the query and asked for any/all words. If you don't want it, don't put it into the query box. Or maybe you can add it as a stopword.

Bug no : 22160 - To my knowledge, "like" is not a stem of "likelihood" - at least not as a default stem in search engines. Where did you get this example? Does this work like that in the legacy application? If it does, you probably have a custom stemming rules file (stemming.dat), so please send it to me for review and inclusion in the project.

Bug No. 22162 - Similar to the previous one: do you use thesaur.xml in the legacy application? Or maybe WordNet synonyms? If you do, you should be able to find these on the legacy server. Check https://support.dtsearch.com/dts0190.htm for info on which folders these might be found in.
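
For bug 22141, one possible workaround on the calling side is to drop such words from the query before it is sent. The Python sketch below is only an illustration: the word list is hypothetical, and it does not handle quoted phrases or the + and - operators.

```python
# Hypothetical noise-word list; a real one would have to match the words
# the application is expected to ignore.
NOISE_WORDS = {"and", "or"}


def strip_noise_words(query: str, noise_words: set[str] = NOISE_WORDS) -> str:
    """Drop noise words (case-insensitive) from a whitespace-separated query."""
    kept = [word for word in query.split() if word.lower() not in noise_words]
    return " ".join(kept)


print(strip_noise_words("abuse and ego"))  # abuse ego
```

This should only be applied to the "any words" search type, since in a boolean search "and" and "or" are operators and must be kept.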
Harsh Parikh, Tech Lead at DevIT
Hi Naomi Joanis, UX Team Lead at Industrial Naomi ,

Please see the above comment from Radomir Mladenovic, Contegra Radomir and advise him regarding the Subject Navigator bugs that you reported.
Naomi Joanis, UX Team Lead at Industrial
Hi Harsh Parikh, Tech Lead at DevIT Harsh and Radomir Mladenovic, Contegra Radomir

In testing the Subject Navigator search, I am using the description of the search types from the legacy app and comparing results to the legacy app. 

  • #22141  – the description of "Any Words" search type is:
     
    "An "any words" search request consists of an unstructured natural language or "plain English" query. In a natural language search request, words such as AND and OR are disregarded. Use quotation marks to indicate a phrase, + (plus) to indicate a word that must be present, and - (minus) to indicate a word that must not be present."

    and when I search "abuse and ego" on the legacy application, I get results for "abuse" and "ego" and not the word "and". This is not occurring on the current application, where I am only seeing results for the word "and" and not results for either "abuse" or "ego".

  • #22160 – I should clarify that even with stemming checked off I am not seeing any stem matches found. For example, searching "like" on the legacy application shows me results for "likely" and "likeness". I am only seeing matches for "like" on the current application even when stemming is checked off.
  • #22162 – Harsh Parikh, Tech Lead at DevIT Harsh ,  this seems like a question for you in terms of how we are matching synonym terms.

Please let me know if any of the tests I've performed above aren't accurate or should be modified, however, going by the results presented in both applications there do seem to be some issues related to the search options. 

Thanks, 

Naomi 
Radomir Mladenovic, Contegra
Naomi Joanis, UX Team Lead at Industrial Naomi  

  • #22141 - It's not that only "and" is being highlighted here. As the current version on test is not getting the complete list of results, you're seeing only the most "relevant" ones - the word "and" appears more often, so results containing it rise to the top of the results list. If you remove "and" from your query, I'm sure you'll see the other two keywords highlighted.
    But, from your description of the legacy behavior, it looks like stopwords are used there. Harsh Parikh, Tech Lead at DevIT Harsh can you check the legacy subjects navigator index for the noise word file (index_n)?
  • #22160 Looks good on my end. With stemming enabled, I see "likeness" found when "like" is used:
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , in the 2021-03-07 folder you can find updated search service and indexer.

The indexer should fix the issue with FTS where keywords were matched in the document filename when searching for paragraphs. In that case the results showed all paragraphs of the document although there were no text matches. You need to rebuild the FTS index.
Update the FTS indexer config with option "CacheText": false. That will prevent caching of the original documents in the index and significantly lower the index size. During the next week I'll prepare an update that approaches PDF highlighting in a more lightweight way. As you don't have PDF highlighting integrated for now, it doesn't affect you.

Next, the search service has been updated to return all results and nodes for the Subjects Navigator search. However, I'd recommend checking again whether you need all nodes from the "SelectedNodes" and "hierarchicalparentids" fields. For example, a search for "good faith" has 734 matches and returns more than 8500 nodes. I also don't understand why node "A", which is supposed to be at the top level, references other nodes:


On my system this search takes about 3.5 seconds to execute and I cannot make it faster as we're fetching thousands of dtSearch documents. Running the search and getting these documents is about 0.2s, but getting the referenced nodes adds 3s more:

The response payload from the search service is more than 17 MB in size. That's a huge response which will take extra time to process on your side as well.

The update also addresses highlighting for boolean expressions for FTS, but I'd recommend also building the new FTS index using the updated indexer before testing this.

P.S. Tomorrow (Monday) I'll be off. 
Harsh Parikh, Tech Lead at DevIT
Hi Morgan Maguire, CEO Morgan   and Radomir Mladenovic, Contegra Radomir ,

As per the above comment on the FTS module, Radomir Mladenovic, Contegra Radomir will provide an update on this by next week.

Morgan Maguire, CEO Morgan , as per the above comment from Radomir Mladenovic, Contegra Radomir , we are rendering all the child branches of that particular parent branch, and it takes 15 seconds to load when searching for "good faith".

My suggestion is that we change the logic and render only the parent and child branches that match the keywords, rather than all branches.

Radomir Mladenovic, Contegra Radomir , currently the matching keywords are not highlighted because you changed the logic. However, we need highlighting as per the previous logic, i.e. we need HighlightedFields within the sub-nav results, as we can't implement the new logic at this time.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh , you can get highlighted fields if you use the "highlight-subject-nav" service (see the message from March 6th). It's simple to re-enable field highlighting on the basic search service, but that will increase the search response time from 3 seconds to 16 or even more; I don't remember exactly.
If you're sure you want to do this, replace line 249 of the SearchController with:
var r = CreateResponse(sm, true, indexes, true, true);
Just let me know if you do it so that I can update it on my side as well.

Morgan Maguire, CEO
Ok. Thanks for the update Harsh Parikh, Tech Lead at DevIT Harsh .

Ketan Sondarva, Technical Project Manager at DevIT Ketan , can you please ensure you discuss this with Harsh Parikh, Tech Lead at DevIT Harsh (and Radomir Mladenovic, Contegra Radomir and Rob Wiesenberg, Contegra Rob  if required) so that we can determine when all dtSearch related features will be complete. Ideally we should have this project wrapped up at the beginning of next week so that Melissa Cowell, General Manager at Industrial Melissa and Naomi Joanis, UX Team Lead at Industrial Naomi have an opportunity to perform further testing on the tool before launch on April 1st.

Thanks,

Morgan
Harsh Parikh, Tech Lead at DevIT
Hi Morgan Maguire, CEO Morgan

Please confirm that we will render only the parent and child branches matching the keyword in the SN module, so that Radomir Mladenovic, Contegra Radomir can change the logic and provide an update to us.

Please confirm.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Once Morgan confirms that we need only the matching branch results, the logic needs to change so that we get only the SelectedNodes results, with the highlighted words included.

Morgan Maguire, CEO Morgan , Please confirm.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh please note that the most time consuming part of handling the subjects navigator search request is highlighting. If you get 700+ results, getting them all highlighted right away takes some time. As I already suggested, I think it's better to use non-highlighted results to build the results page quickly, then use "highlight-subject-nav" to get progressively highlighted fields for a subset of nodes (displayed on page), and update the page elements.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

As we discussed, my team is fully occupied with pending work on the other application, which we need to complete before 25th March, so we cannot change anything in the current build now. Please do something so that we get only the selectednodes results, with the highlighted words included.

That is, we need to take the selectednodes result and, using those selectednodes, get all of their branchid results in the response with highlighted words.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh can you please clarify "get only selectednodes result"? You want to remove "hierarchicalparentids" and enable highlighting as it was before?
Harsh Parikh, Tech Lead at DevIT
Yes Radomir Mladenovic, Contegra Radomir. We need only the results for all branch IDs of the selected nodes.

For example, if you search for "abuse of process", get the selected node ID of every row and then return the results for all of those branch IDs.

So if the selected node IDs are (1,5,8,11,15,16), we need the result for each branch ID in (1,5,11,15,16). It means we need to render only the parent and child branches that match the keyword, with highlighting enabled.

But first of all we need confirmation from Morgan, so please wait for Morgan's reply.
Morgan Maguire, CEO
Hi Harsh Parikh, Tech Lead at DevIT Harsh,

I'm not following what you're requesting me to approve. Please provide me with a concrete example of how this affects the display of the results.

Also, it sounds like we're pushing UI work onto Radomir Mladenovic, Contegra Radomir. Please ensure that we're not using his services for work that should be done by your team. Radomir Mladenovic, Contegra Radomir was engaged to build the customized indexes and advise your team on implementation, so let's ensure we're not expanding that scope.

Thanks,

Morgan
Harsh Parikh, Tech Lead at DevIT
Hi Morgan Maguire, CEO Morgan ,

We would prefer to have a call with Radomir Mladenovic, Contegra Radomir about the SN & DD module searches, along with Melissa, as we want to fix the requirements so that we can integrate them into the system quickly (within 1 to 2 days).

Let's schedule the call for tomorrow, 10th March, at 8:15 AM Vancouver time. Please coordinate with Radomir Mladenovic, Contegra Radomir and schedule the meeting.
Morgan Maguire, CEO
Ok. That's fine Harsh Parikh, Tech Lead at DevIT Harsh. I'll set up the call. However, I have a very full schedule this week, and I'll need to limit the call to 30 minutes. So please send any details in advance of the call so that we can get right to the issues.

Thanks,

Morgan
Morgan Maguire, CEO
I've sent out a calendar invite for a call tomorrow at 8:15am Vancouver time between Harsh Parikh, Tech Lead at DevIT Harsh, Ketan Sondarva, Technical Project Manager at DevIT Ketan, Melissa Cowell, General Manager at Industrial Melissa, Radomir Mladenovic, Contegra Radomir and myself. Let me know if anyone else needs to join and I can add them to the calendar invite.

Thanks,

Morgan
Rob Wiesenberg, Contegra
Morgan Maguire, CEO Morgan  please send me an invitation as well. Thanks. 
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , I updated the search service for the discussed subjects navigator search changes: disabled highlighting and collecting only "selectednodes". You can find it in the 2021-03-10 folder.
I'll have a new disputes search method for you later tonight or tomorrow morning.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , I updated the search controller in the 2021-03-10 folder with a new method: "disputes-details"
You need to send it the same request (with the query and filters) as to the "disputes", extended with the additional filtering fields:
    "FieldFilterName": "disputeid",
    "FieldFilterValues": [12877]
I hope this does what you needed it to do. Let me know if you have any questions.
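As a rough illustration of the request described above, the sketch below takes an existing "disputes" request and extends it with the two filtering fields before it is sent to "disputes-details". The helper name and the sample filter are hypothetical; only `FieldFilterName` and `FieldFilterValues` come from the message.

```python
import copy
import json


def to_details_request(disputes_request: dict, dispute_id: int) -> dict:
    """Copy a "disputes" request and add the dispute filter fields."""
    request = copy.deepcopy(disputes_request)  # leave the original untouched
    request["FieldFilterName"] = "disputeid"
    request["FieldFilterValues"] = [dispute_id]
    return request


# Illustrative "disputes" request: a query plus one filter clause.
base = {
    "searchRequest": "",
    "FilterStatement": {
        "type": "boolean",
        "Operator": "and",
        "clauses": [{"type": "match", "field": "Field_62", "values": ["234"]}],
    },
}
print(json.dumps(to_details_request(base, 12877), indent=2))
```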
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

The Subject Navigator is OK now. But for the Dispute Document module, if we pass only the English language filter, the response takes a very long time. Also, as discussed, we don't want the words highlighted in the Dispute Document module either.

The following is my search request when I filter by English language only.

{"searchRequest":"","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"Field_62","values":["234"]}]},"SortField":"FullCitationText","SortOrder":"asc"}


Please look into this and let us know.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Another thing: when we send the following request, we do not get any results.

{"searchRequest":"","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"Field_62","values":["235"]},{"type":"match","field":"Field_109","values":["377"]}]},"SortField":"FullCitationText","SortOrder":"asc"}
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We have implemented the new method, "disputes-details". But I see that you are returning the result from the first stored proc.

We need the result from the second stored proc, by passing the Dispute Id: SelectDisputeContegraSearch.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , I guess I misunderstood what you need for the dispute-details. To return results from the second stored procedure, I'll need to make changes to the indexer and create a separate index for the second procedure data.
As for the dispute search performance, I guess it's because you're searching without a keyword, so you're getting a long list of results. (I believe the highlighting is already off.) Can you please send me the timeLog from the response? It should show the actual time and number of records collected.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Please don't change anything. You don't need to create a separate index; using the following code in your SearchController for the disputes-details method is fine. We are passing the same request (with the query and filters) as to "disputes", extended with the additional filtering fields:

        [HttpPost("disputes-details")]
        public IActionResult SearchDisputesDetails([FromBody] SearchModel sm)
        {
            Stopwatch stopwatch = new Stopwatch();
            stopwatch.Start();

            List<string> indexes = new List<string>() { Settings.Tologix.DisputesIndex };
            ApiError err = Search(sm, indexes, false);
            if (err != null)
            {
                return new ObjectResult(err);
            }
            var searchTime = stopwatch.ElapsedMilliseconds;
            bool highlight = false; // sm.SearchRequest != "xlastword" && !string.IsNullOrWhiteSpace(sm.SearchRequest);
            var r = CreateResponse(sm, false, indexes, highlight, false);
            r.TimeLog.Insert(0, "search: " + searchTime);

            Stopwatch stopwatch2 = new Stopwatch();
            stopwatch2.Start();

            // get distinct DisputeId values in results
            WordListBuilder wordListBuilder = new WordListBuilder();
            wordListBuilder.OpenIndex(Settings.Tologix.DisputesIndex, indexCache);
            wordListBuilder.SetFilter(sm.ResultsAsFilter);
            int values = wordListBuilder.ListFieldValues("DisputeId", "*", 10000);
            log.LogDebug("Found " + values + " disputes (wordListBuilder.Count = " + wordListBuilder.Count + ")");
            List<string> disputeIds = new List<string>();

            for (int i = 0; i < wordListBuilder.Count; ++i)
            {
                String word = wordListBuilder.GetNthWord(i);
                int docCount = wordListBuilder.GetNthWordDocCount(i);
                disputeIds.Add(word);
                //log.LogDebug("- " + word + " " + docCount);
            }
            // a DisputeId field value may hold several comma-separated ids; flatten them
            List<string> disputewithcontentypedatamaster = new List<string>();
            for (int j = 0; j < disputeIds.Count; j++)
            {
                disputewithcontentypedatamaster.AddRange(disputeIds[j].Split(','));
            }
            // get all documents for found disputes
            if (indexes != null && disputewithcontentypedatamaster.Count > 0)
            {
                var disputesResults = new List<ResultDocument>();
                FindByField("ContentTypeDataMasterId", disputewithcontentypedatamaster, new List<string>() { Settings.Tologix.DisputeDocsIndex },
                    disputeRes =>
                    {
                        for (int i = 0; i < disputeRes.Count; ++i)
                        {
                            disputeRes.GetNthDoc(i);
                            disputesResults.Add(createResultDocument(disputeRes.CurrentItem));
                        }
                    });
                r.Results = disputesResults;
            }
            r.TimeLog.Add("collect nodes for disputeId (" + r.Results.Count + "): " + stopwatch2.ElapsedMilliseconds);

            stopwatch.Stop();
            r.TimeLog.Add("total: " + stopwatch.ElapsedMilliseconds);

            return Ok(r);
        }


And I think it works OK. But when we search by language only, it takes a long time, and highlighting is still on.

Can we have a short call so that we are both on the same page?
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

If we search with only the English language filter, this is the JSON we are passing.

{"searchRequest":"","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"Field_62","values":["234"]}]}}

Postman times out when we send the above search request.

Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Sorting is also not working on the results in the Dispute Document module. We are passing the sort field and sort order in the JSON request, but the result is not sorted.

If I want to sort by FullCitation in descending order, we pass the following JSON request.

Sort Field : FullCitation
Sort Order : desc

{"searchRequest":"","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"Field_109","values":["403"]}]},"SortField":"FullCitationText","SortOrder":"desc"} 
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Have you gone through the Dispute & Document module queries above? Also, could you let us know whether the FTS module is done from your side?

We are planning to deploy the WebAPI and Indexer on the server by early next week.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh, I was traveling most of the day yesterday, so I didn't have time to check the Disputes search. Will do it today.
As for the FTS, I'm not aware of any open issues, so I consider it complete from my side.
Harsh Parikh, Tech Lead at DevIT
OK, thanks Radomir Mladenovic, Contegra Radomir for the update. Let me know if you need anything for the Dispute Document module.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , thanks, I copied your changes.

I'll try to address the issues in the order you reported them:

1) Response taking too long for filter only search:

{"searchRequest":"","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"Field_62","values":["234"]}]},"SortField":"FullCitationText","SortOrder":"asc"}

You could increase timeout in Postman - on my system the request takes about 78 seconds. However, the problem with this query is that it gives way too many results - there are 231956 records returned for initially found 5783 matches! The response is over 200MB!

I don't think it makes sense to return all results. With a few different searches, someone could scrape your whole database. I think we should put a hard limit of, say, 1000 (or 10K) items and never return more than that. Better to give the user an error asking them to refine the query.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We need only the first DisputeId value, which we pass to the second query to get the result. Suppose the DisputeIds are (12877,1500,1501,1502); you can always take the first value of the DisputeID column (12877) and pass it to the second query.

This is because we first need to display only the dispute node, and then on click we call our second method (disputes-details) to get all the data.

I am available for a call to discuss.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh I don't understand the changes you made to "disputes-details". You just copied the whole "disputes" method; there are no other changes. What's the point? If you can still use "disputes" with just a modified fields search, then let's do that and avoid code duplication.

I'm available for a call atm. Will call you in a few minutes.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh I'm attaching the modified controller with the discussed changes for the "disputeId" handling in the first and the second disputes search method.

As mentioned in the call, sorting is now broken because the sort is applied to the first dtSearch search call, where we collect the "disputeId" values. In the rest of the method we collect the dtSearch documents having those disputeIds. However, as we're hitting the dtSearch query size limit (70K, because we need to enumerate all the different disputeIds), this result fetching is run in batches, so the sorting cannot be applied in dtSearch.

I think I saw that you have only a couple of different fields here for which you use sorting, both referenced with SortField. Is that correct? In that case I can implement sorting in the controller, after fetching all the results. I cannot finish it today but hopefully you can have this tomorrow afternoon.
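A sketch of the idea described above: fetch the documents in batches so that each query stays under a size limit, then sort the merged list once all batches are in. It is written in Python for brevity and is not the controller code. `fetch_batch`, the " OR " separator, and treating the 70K limit as a character count are all assumptions.

```python
from typing import Callable, Iterable, List


def batches(ids: List[str], max_query_chars: int, sep: str = " OR ") -> Iterable[List[str]]:
    """Split ids into groups whose joined query text fits in max_query_chars."""
    batch: List[str] = []
    length = 0
    for item in ids:
        extra = len(item) + (len(sep) if batch else 0)
        if batch and length + extra > max_query_chars:
            yield batch
            batch, length = [], 0
            extra = len(item)
        batch.append(item)
        length += extra
    if batch:
        yield batch


def fetch_sorted(
    ids: List[str],
    fetch_batch: Callable[[List[str]], List[dict]],
    sort_field: str,
    descending: bool = False,
    max_query_chars: int = 70_000,
) -> List[dict]:
    """Collect results batch by batch, then sort the merged list in memory."""
    results: List[dict] = []
    for batch in batches(ids, max_query_chars):
        results.extend(fetch_batch(batch))
    # Sorting happens after collection because no single query sees all rows.
    results.sort(key=lambda r: str(r.get(sort_field, "")).casefold(), reverse=descending)
    return results
```

The trade-off is that every result has to be held in memory before the sort can run, so the cost grows with the total number of results rather than with the page size.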
Harsh Parikh, Tech Lead at DevIT
OK, thanks Radomir Mladenovic, Contegra Radomir. I hope the performance issue is resolved when we pass only the English language filter. Please correct me if I'm wrong.

Also, please let us know once you have completed the sorting in the controller.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

I have replaced the Search Controller and tested passing only the English language filter, but it is still taking a long time to get a response.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh, yes, I saw that. But that's expected. Still too many nodes. In fact, when I debugged it, I saw that in this case none of the matched records had more than one disputeId in the field, so the change didn't affect this at all.
Harsh Parikh, Tech Lead at DevIT
OK Radomir Mladenovic, Contegra Radomir. If we make the keyword search mandatory, it will resolve this issue (meaning the user has to enter a keyword along with any other filters).

What is your suggestion? Will it work?
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh I suggest adding some absolute limit anyway, e.g. 1000 items. If a keyword is mandatory and the user enters some very common word, we're back to the same problem.
Harsh Parikh, Tech Lead at DevIT
Radomir Mladenovic, Contegra Radomir, I don't follow what you mean by an absolute limit. I assume you would set a limit of up to 1000 items, right?
Radomir Mladenovic, Contegra
Yes, I think we should stop somewhere. You cannot return 200K results to the user, as it happens now with your metadata example.
Harsh Parikh, Tech Lead at DevIT
Radomir Mladenovic, Contegra Radomir, we cannot implement pagination or lazy loading before launch, so we need to find an alternative solution.
Radomir Mladenovic, Contegra
I see limiting results at some point as an alternative solution, which is very simple to implement.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh there are updated files in the 2021-03-12 folder, with the following changes:

1) Added sorting by field after collecting all results in Disputes search. (BTW, in your sorting example, the field should have been "FullCitation", instead of "FullCitationText".)

2) Added hard-stop limit for the number of results returned for the Disputes and Subjects Navigator search. It's optional but, if you want to enable it, add "ResultListStopCount" to the Tologix section:

{
  "SearchSettings": {
    "Tologix": {
      "ResultListStopCount": 10000,
      ...
Harsh Parikh, Tech Lead at DevIT
Thanks Radomir Mladenovic, Contegra Radomir . Will change the sort field name and set as FullCitation.

But I am still confused about the hard-stop limit for the number of results. Could you explain this further?

What is the meaning of ResultListStopCount?
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh, until you implement pagination, ResultListStopCount gives you an option to limit the number of results returned - so that a user cannot pull 200K results, which would take an eternity to render anyway. I see it as a means to prevent unnecessary load on the service. If you don't want to use it, that's fine as well.
When this limit is reached, the search service will stop collecting results, return what was collected up to that point, and set the "HardStop" flag in the response. You could use this flag in your app to tell the user that the search is too broad and needs to be refined.
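To make the behaviour described above concrete, here is a small sketch of a collection loop that stops at a limit and flags the response, together with a client-side check of the flag. It is not the service code: the `Results` key, the `user_message` helper, and the choice to set the flag only when results were actually left out are assumptions.

```python
from typing import Iterable, Optional


def collect_with_stop(items: Iterable[dict], stop_count: Optional[int]) -> dict:
    """Collect results until stop_count is reached; None disables the limit."""
    results = []
    hard_stop = False
    for item in items:
        if stop_count is not None and len(results) >= stop_count:
            hard_stop = True  # at least one more result existed but was skipped
            break
        results.append(item)
    return {"Results": results, "HardStop": hard_stop}


def user_message(response: dict) -> Optional[str]:
    """Return a hint for the user when the result list was cut short."""
    if response.get("HardStop"):
        return "Your search is too broad. Please refine it."
    return None
```

For example, with a limit of 10000 and a filter-only search matching 200K records, the caller would receive the first 10000 items plus the flag, and could show the hint instead of trying to render everything.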
Darsh Shah, DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

For the FTS module, when we search for the word "nationality", we get paragraph number 28 of the following PDF file in the results.
But when we click on number 28, we are not able to find the text in the PDF file.
We are facing this issue in other files as well.
I have attached one of the PDFs and a video file below. Please check and confirm.

Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , I sent you a fix to the 2021-03-15 folder.
Note that the page you sent me is not page 28. This is a PDF document, so the results are pages, not paragraphs. Page 28 contains "nationality":
Radomir Mladenovic, Contegra
Darsh Shah, DevIT Darsh sorry, the above message should have been addressed to you, not Harsh.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

In many places we found that clicking on a page number doesn't bring up the result. Is this issue resolved generally, or only for this specific PDF file?
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh the fix addresses similar cases as well.
Harsh Parikh, Tech Lead at DevIT
OK, thanks Radomir Mladenovic, Contegra Radomir for the quick reply. We will integrate it, check, and let you know.
Darsh Shah, DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

For the FTS module, when we search for the word "Like", some documents in the results display the same paragraph multiple times.

Please check the attached video for more information.


Also, please find attached PDF and HTML files for your reference.

Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Today we deployed the latest version on the server (10.68.138.10), but we are not able to find the paragraph number in the FTS module.

Could you please look into this urgently?

The following are the things you can check on the server (10.68.138.10):

DBIndexerProjectPath : E:\DevContegraISLGRebuildStagingDBIndexer

Indexes :  E:\DevContegraISLGRebuildStagingIndexes

DocumentPath : E:\ISLGRebuildStaging\wwwroot\Documents
Morgan Maguire, CEO
Hi Rob Wiesenberg, Contegra Rob and Radomir Mladenovic, Contegra Radomir ,

Further to the most recent results for the searches in Dispute & Dispute Documents: Re: Dispute Documents search field does not produce any results - TOLOGIX - ISLG App Rebuild, the Subject Navigator: Re: Problem with Subject Navigator search field, live site - TOLOGIX - ISLG App Rebuild and the issues with the FTS above, I am very concerned with the lack of progress in finding a resolution on this project.

Could you please ensure that you connect with Harsh Parikh, Tech Lead at DevIT Harsh and Ketan Sondarva, Technical Project Manager at DevIT Ketan immediately to determine a solution to these problems.

Thanks,

Morgan
Rob Wiesenberg, Contegra
Morgan Maguire, CEO Morgan , yes we will continue to work with the team to get these resolved. 
Morgan Maguire, CEO
Ok. Thanks Rob Wiesenberg, Contegra Rob .

Morgan
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Today we deployed TologixDBIndexer and the WebAPI on the server and found that the indexes created for the FTS module are smaller than in our local environment.

For example, on the server (10.68.138.10), the size of index_r_1 in the FTS Para folder is 61,000 KB, whereas locally it was 463,030 KB.

Is there an issue? We deployed the same build that we used in our local environment.

The Subject Navigator and Dispute Document indexes were created successfully and are working as per our criteria, but in the FTS module we are not able to get the paragraph or page number.

Please look into this as soon as possible, as we need to release this module for UAT by tomorrow.
Naomi Joanis, UX Team Lead at Industrial
Hi Radomir Mladenovic, Contegra Radomir

Following up on some of the bugs I logged earlier, I'm still not sure if the search is working as expected. 

#22162 - Synonyms > Not Working
#22160 - Stemming > Not Working
#22141 – Any words > Search doesn't disregard words such as "and" or "or"
#22155 - All Words
Please let me know if anything is unclear with these issues, or if I'm performing invalid scenarios.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh, for me the FTS paragraphs index size is 1.4 GB. I suspect that the URL to the PDF Highlighter in your config file is no longer valid. You said it worked before, so maybe something changed in the IIS config. You reference the Highlighter using:

  "PdfHighlighterUrl": "http://10.68.138.10/highlighter/",

But I get 404 error page when I open http://10.68.138.10/highlighter/

As you're running this directly on the server, there's no need to go through IIS. Try using:

  "PdfHighlighterUrl": "http://10.68.138.10:8998",

Hope this helps. Let me know.
Radomir Mladenovic, Contegra
Hi Naomi Joanis, UX Team Lead at Industrial Naomi , Harsh Parikh, Tech Lead at DevIT Harsh ,

#22162 - I already commented on this one on Mar 6 and asked you to check the thesaurus config on the legacy server so that we can figure out which thesaurus you're using. I didn't get any feedback on this.
Or, if you have source code of the legacy search application that can be helpful as well in figuring out thesaurus options currently used.

#22160 - As before, looks good on my end. Testing this in the Subjects Navigator index, "like" gives 273 results with stemming off, and 392 results with stemming enabled. (My screenshot sent on Mar 6 also shows this working.)
Maybe the web application is not sending the stemming checkbox value to the search service properly.

#22141 - How many results are you pulling in the result page in the legacy application? The problem is the new application is trying to get all results at once. When I test it, I get more than 32000 results and the response is 70MB. And that's without stemming and fuzzy that you have enabled in your search.

#22155 - I'm not sure I understand this one. From the comment in your screenshot, I'd say you expect only those results containing all keywords in the branch name only. Is that correct? If true, that's a new requirement for me. We index all meta fields in order to support filtering but I don't think that dtSearch supports search requests limited to a single field (e.g. branch name). I'll check this.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Could you please provide a sample search request with stemming for the Subject Navigator module, so that we can check our service request by tomorrow?
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , sorry if I wasn't clear. You should use http://10.68.138.10:8998
I tried the one you're using (http://10.68.138.10:8998/highlighter) and that doesn't work properly and creates a much smaller index.

As for stemming, the syntax has been the same since it was introduced, and you already have it in many of the examples posted:
{
    "searchRequest": "education award",
    "SearchType": "AllWords",
    "Stemming": true,
    "Synonyms": false,
...
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh Naomi Joanis, UX Team Lead at Industrial Naomi please confirm how the Subjects Navigator search should work - related to my comment about issue #22155 above.
dtSearch does not support search requests limited to a single field using a search query and boolean/anywords/allwords options. We could work around this by transforming the query to a fielded search (https://support.dtsearch.com/webhelp/dtsearch/field_searching.htm), but I need confirmation from your end as it's extra development.
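A rough sketch of what such a query transformation could look like, assuming the `field contains (...)` form described on the linked field-searching page. The field name `BranchName` and the `AnyWords` search type name are hypothetical, and the sketch does not escape reserved words or punctuation, so it should be checked against the dtSearch documentation before use.

```python
def to_fielded_query(words: str, field: str, search_type: str = "AllWords") -> str:
    """Rewrite a plain keyword request as a search limited to one field."""
    terms = words.split()
    if not terms:
        raise ValueError("empty search request")
    # AllWords joins the terms with "and"; AnyWords (assumed name) with "or".
    operator = {"AllWords": " and ", "AnyWords": " or "}[search_type]
    return f"{field} contains ({operator.join(terms)})"


print(to_fielded_query("education award", "BranchName"))
# BranchName contains (education and award)
```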
Harsh Parikh, Tech Lead at DevIT
Naomi Joanis, UX Team Lead at Industrial Naomi, could you please reply to Radomir's comment above on bug no. 22155, as you have the requirements for those search results?
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh , regarding the issue with paragraphs shown multiple times, I cannot reproduce it. The result I'm getting for the sample HTML you sent me shows only one paragraph:

and the extracted paragraphs don't contain the other "copy" of the paragraph. Here are all the paragraphs extracted:



Please notice that in your video the paragraphs are not exactly the same. Some have an extra dot, some have parentheses, etc. I suspect that you didn't delete the old index after I provided the updated indexer, so paragraphs extracted by the new indexer were added on top of what was already in the index (and the old paragraphs were not overwritten because the paragraph ID format was modified).
Delete all three FTS indexes, index again, and I think it will be fine.
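The effect described above can be illustrated with a toy index keyed by paragraph ID. This is only a model of the suspected cause, not the indexer itself, and both ID formats below are hypothetical.

```python
from typing import Callable, Dict, List


def index_paragraphs(
    index: Dict[str, str],
    doc_id: str,
    paragraphs: List[str],
    make_id: Callable[[str, int], str],
) -> None:
    """Add paragraphs to the index, overwriting entries that share an ID."""
    for number, text in enumerate(paragraphs, start=1):
        index[make_id(doc_id, number)] = text


def old_id(doc_id: str, number: int) -> str:
    return f"{doc_id}-p{number}"      # hypothetical old ID format


def new_id(doc_id: str, number: int) -> str:
    return f"{doc_id}:{number:04d}"   # hypothetical new ID format


# Reusing the old index: the new IDs never match the old ones, so both copies stay.
reused: Dict[str, str] = {}
index_paragraphs(reused, "doc1", ["Like any treaty."], old_id)
index_paragraphs(reused, "doc1", ["Like any treaty"], new_id)

# Rebuilding from scratch: only the paragraphs from the new indexer exist.
rebuilt: Dict[str, str] = {}
index_paragraphs(rebuilt, "doc1", ["Like any treaty"], new_id)
```

Here `reused` ends up with two near-identical paragraphs for the same document, which matches the slightly different "copies" seen in the video, while `rebuilt` holds one.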
Morgan Maguire, CEO
Hi Radomir Mladenovic, Contegra Radomir and Harsh Parikh, Tech Lead at DevIT Harsh ,

Re #22155, I don't think there is a problem running the search request across different fields; however, as Naomi Joanis, UX Team Lead at Industrial Naomi described in the user story, the results in the new ISLG do not include the result from legacy application: https://www.investorstatelawguide.com/ResearchTools/SubjectNavigator?toc=content&id=50&tab=r&search=education+award&searchType=all&stem=1&thes=&ftypo=1&fuzziness=1
Why isn't this branch included within the All Words search for "education award" within the new ISLG when it fits the parameters of the search? http://staging.investorstatelawguide.com/SubjectNavigator/Index?branchid=6UQPkRs5-Qc%3D
Thanks,

Morgan
Radomir Mladenovic, Contegra
Hi Morgan Maguire, CEO Morgan, sorry, I cannot tell from your screenshots what's different. I think I see the same branches. I don't have a login for either the current production or staging to compare the live websites.
When I run this search in my environment, using staging data, I think I see the same document:
Morgan Maguire, CEO
That's odd Radomir Mladenovic, Contegra Radomir. I've given you access to the subscriber side of the staging environment; you should have received an automated email prompting you to activate your account. The subject navigator search is located here: http://staging.investorstatelawguide.com/SubjectNavigator/Index

Harsh Parikh, Tech Lead at DevIT Harsh , do you have any idea why the results on staging.islg are different from what Radomir Mladenovic, Contegra Radomir has shown above?

Morgan
Harsh Parikh, Tech Lead at DevIT
Hi Morgan Maguire, CEO Morgan ,

As per the screenshots you provided, both show the same result. What is the difference? Both screenshots display the same result for "education award".
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

As per Morgan's comment above, we are passing the following search request for the Subject Navigator, but we are not able to get the branch that Morgan showed us.

Could you please look into this search request and advise? We are available for a call to resolve this issue.

{
   "ErrorMessage":null,
   "WasError":false,
   "SearchRequest":"education award",
   "PageNum":0,
   "PageSize":0,
   "Fuzzy":true,
   "Fuzziness":1,
   "Stemming":true,
   "WordNetSynonyms":false,
   "Synonyms":false,
   "PhonicSearching":false,
   "SearchType":1,
   "SortField":null,
   "SortOrder":null,
   "SearchFlags":0,
   "Custom":null,
   "NoFrames":false,
   "EnableDateSearch":false,
   "StartDate":null,
   "EndDate":null,
   "FileConditions":null,
   "BooleanConditions":null,
   "QueryStatement":null,
   "FilterStatement":null,
   "Facets":null,
   "IxId":null,
   "IndexIds":null,
   "IncludeSynopsis":true,
   "Near":14,
   "ExcludeEnabled":false,
   "ExcludeTerm":null,
   "TreePath":null,
   "paraId":null,
   "FieldFilterName":null,
   "FieldFilterValues":null
}
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , I'm getting the same results using your request payload. I'm attaching the complete response.
Maybe there's something wrong with your index? My index is 260MB in size.
I'll give you a call in a few minutes.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Thanks for taking the call.

Naomi Joanis, UX Team Lead at Industrial Naomi, the following are the comments on the SN bugs.

#22162 - I already commented this one on Mar 6 and asked you to check the thesaurus config on the legacy server so that we figure out which thesaurus you're using. I didn't get any feedback on this.
Or, if you have source code of the legacy search application that can be helpful as well in figuring out thesaurus options currently used. - It is resolved on staging.islg with the help of Radomir Mladenovic, Contegra Radomir. You can check.

#22160 - As before, looks good on my end. Testing this in the Subjects Navigator index, "like" gives 273 results with stemming off, and 392 results with stemming enabled. (My screenshot sent on Mar 6 also shows this working.)
Maybe the web application is not sending the stemming checkbox value to the search service properly. - It is resolved on staging.islg with the help of Radomir Mladenovic, Contegra Radomir, so you can check it.

#22141 - How many results are you pulling in the result page in the legacy application? The problem is the new application is trying to get all results at once. When I test it, I get more than 32000 results and the response is 70MB. And that's without stemming and fuzzy that you have enabled in your search. - Radomir Mladenovic, Contegra Radomir will need to look into it.

#22155 - I'm not sure I understand this one. From the comment in your screenshot, I'd say you expect only those results containing all keywords in the branch name only. Is that correct? If true, that's a new requirement for me. We index all meta fields in order to support filtering but I don't think that dtSearch supports search requests limited to a single field (e.g. branch name). I'll check this. - It is resolved on staging.islg with the help of Radomir Mladenovic, Contegra Radomir, so you can check it.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

When we get the response from the Subject Navigator, the branch name is cut off. We are not getting the whole branch name. We checked our SQL view, and we are providing you with the full branch name.

For example, when you search for "abuse of process", you get BranchId 19867 in the results, with the branch name as follows:

Philip Morris v. Australia Award on Jurisdiction and Admissibility considers that the initiation of a treaty-based investor-State arbitration constitutes an abuse of rights (or an abuse of process, the rights abused being procedural in nature) when an investor has changed its corporate structure to gain the protection of an investment treaty at a point in time when a specific dispute was foreseeable; a dispute is foreseeable when there is a reasonable prospect that a measure which may give rise to



But the actual branch text is:

Philip Morris v. Australia Award on Jurisdiction and Admissibility considers that the initiation of a treaty-based investor-State arbitration constitutes an abuse of rights (or an abuse of process, the rights abused being procedural in nature) when an investor has changed its corporate structure to gain the protection of an investment treaty at a point in time when a specific dispute was foreseeable; a dispute is foreseeable when there is a reasonable prospect that a measure which may give rise to a treaty claim will materialize


The last part, "a treaty claim will materialize", is cut off from your response.

Please check and confirm.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh , I copied the "wordnet" folder with the synonyms database to your dtsearch config folder on the server.
WordNet is already supported by the search controller. However, instead of using "Synonyms" in the search model, you need to use "WordNetSynonyms", as in:
{
    "SearchRequest": "like",
    "SearchType": "AllWords",
    "WordNetSynonyms": true,
    "PageSize": 200,
    "Fuzzy": false,
    "Fuzziness": 1,
...
That should be all you need. I tested it on my side and synonyms work.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

In which folder on the server did you copy the "wordnet" folder? We can't see any such folder in the indexing folder.

Also, do we need to change the Synonyms name in your WebAPI project's SearchModel?
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh , you don't need to change anything in the SearchModel - WordNetSynonyms already exists. Just send it in the request when you want to search with the thesaurus.
WordNet folder location:

Again, you don't need to change anything here as we already set up the config folder during our call.
Harsh Parikh, Tech Lead at DevIT
OK, thanks Radomir Mladenovic, Contegra Radomir . Please let us know why the branch text is cut off when we get the response from Search.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh it looks like we hit some dtSearch default length limit with the branch text. I increased the limit to the max of 8192 characters and am now building a new index to test this. I hope to have an update on this within an hour.
Harsh Parikh, Tech Lead at DevIT
Hi Morgan Maguire, CEO Morgan ,

Bug no. 22155 (wrong results for the "education award" search) is resolved on staging.islg with the help of Radomir Mladenovic, Contegra Radomir . You can check and confirm.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , the issue with cut off text is fixed after bumping up the limit in dtsearch. Get the updated indexer and the search app update from the 2021-03-17 folder.
I already created Subjects Nav index with this and copied for you to E:\DevContegraISLGRebuildStagingIndexes\subject-nav-stage
 
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Could you please explain to Morgan Maguire, CEO Morgan the Dispute Document search performance issue that we discussed on today's call?
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh Morgan Maguire, CEO Morgan I already commented on this on Mar 12:
... the problem with this query is that it gives way too many results - there are 231956 records returned for initially found 5783 matches! The response is over 200MB!
I think it doesn't make sense to return all results. With a few different searches, someone could scrape your whole database. I think we should put a hard limit of say 1000 (or 10K) items and never return more than that. Better to give the user an error asking them to refine the query.
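A minimal sketch of the hard limit proposed above, assuming the service knows the total hit count before it builds the response. Python is used only for illustration; `TooManyResultsError`, `MAX_RESULTS`, and `enforce_result_limit` are hypothetical names and not part of the actual search service.

```python
class TooManyResultsError(Exception):
    """Raised when a query matches more items than the service will return."""


# Hypothetical cap; the message above suggests 1000 (or 10K).
MAX_RESULTS = 1000


def enforce_result_limit(total_hits: int, max_results: int = MAX_RESULTS) -> int:
    """Return total_hits unchanged, or raise if it exceeds the cap."""
    if total_hits > max_results:
        raise TooManyResultsError(
            f"{total_hits} results found; please refine the query "
            f"(limit is {max_results})."
        )
    return total_hits
```

With this shape, a 208-match query passes through untouched, while the 231956-record case would surface as an error the UI can show to the user.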
Naomi Joanis, UX Team Lead at Industrial
Hi Radomir Mladenovic, Contegra Radomir

Thanks for your help/responses to the above, since these bugs have moved back to UAT I'll follow up on the outstanding issue I'm seeing. 

#22162 - Synonyms > Not Working
  • Moved this card to done in TargetProcess
#22160 - Stemming > Not Working
  • Moved this card to done in TargetProcess
#22141 – Any words > Search doesn't disregard words such as "and" or "or"
  • Still in Progress with Radomir/DevIT
#22155 - All Words
  • The branch that was previously missing (under the letter D) appears, however there is an additional branch under the letter P that doesn't include both of the keywords that I'm searching with. 
Radomir Mladenovic, Contegra
Hi Naomi Joanis, UX Team Lead at Industrial Naomi , as for #22155, the word "award" appears in the document field "documenttypes": "Partial Awards or Decisions on the Merits"
Morgan Maguire, CEO
The updated results for "edecuated award" look good to me based on Harsh Parikh, Tech Lead at DevIT Harsh 's explanation above. However, Naomi Joanis, UX Team Lead at Industrial Naomi could run a few more tests to make sure the results are accurate.

Thanks,

Morgan
Naomi Joanis, UX Team Lead at Industrial
Marked Done in TP
Morgan Maguire, CEO
Following-up on Radomir Mladenovic, Contegra Radomir 's comments above:

... the problem with this query is that it gives way too many results - there are 231956 records returned for initially found 5783 matches! The response is over 200MB!
I think it doesn't make sense to return all results. With a few different searches, someone could scrape your whole database. I think we should put a hard limit of say 1000 (or 10K) items and never return more than that. Better to give the user an error asking them to refine the query.

I don't want to impose limits on the results, because this will create problems for users if they need to cast broad net searches. Also, this still doesn't explain why the search is so slow for the following example:

  • Entered "1 January 2020" into From Date field
  • Entered "31 March 2021" into To Date field
  • Select Submit Search
  • Search took 30 seconds to produce results, with only 208 matches in the result list.

This type of search should not be taking this long. 

Thanks,

Morgan
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh can you please send me the request payload for the above date range search?
Rob Wiesenberg, Contegra
Naomi Joanis, UX Team Lead at Industrial Naomi , I just spoke with Morgan Maguire, CEO Morgan   and he agreed that it would be helpful if you could test all of the various search filters and confirm that search performance is fast. If you encounter any slowness, please report back with the sample query that you used for testing.
Morgan Maguire, CEO
Hi Naomi Joanis, UX Team Lead at Industrial Naomi ,

Following up on Rob Wiesenberg, Contegra Rob 's comment above, could we run tests on the Dispute & Dispute Documents search to ensure all the filters perform at a satisfactory level? Similar to my post above, we need to ensure that typical searches for documents using the different filtering options generate results in a timely manner.

This would include:

  • Language: English
  • Applicable Arbitration Rules: ICSID Arbitration Rules (all versions)
  • Applicable Treaty: NAFTA Chapter 11
  • Date Range: all documents from the past year
  • Document Type: Final Awards
  • Respondent State: Canada

Could you post the time it takes to generate each result? My expectation is that none of these searches should take more than 4-5 seconds to generate a result.

Thanks,

Morgan
Naomi Joanis, UX Team Lead at Industrial
Yes, will do!
Rob Wiesenberg, Contegra
Morgan Maguire, CEO Morgan , just to clarify further... because the system is currently coded to retrieve all results, search of the Dispute documents will still be slow in cases where they return more than 20k results.  We can discuss further on our call tomorrow.
Morgan Maguire, CEO
Understood, Rob Wiesenberg, Contegra Rob . But I don't think that's the case with any of the searches above. For example, applying the Language: English filter produces 719 matches, and it took more than 30 seconds to generate the results. Therefore, the volume of the search results shouldn't be the problem.


Thanks,

Morgan
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

The following is the request payload for the Dispute Document module when the user filters by date range only.

{"searchRequest":"","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"boolean","Operator":"or","clauses":[{"type":"range","field":"Field_61","from":"20191231","to":"20210330"},{"type":"range","field":"Field_110","from":"20191231","to":"20210330"}]}]},"SortField":"FullCitation","SortOrder":"asc"}
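A small sketch of how a client could assemble the `FilterStatement` of the payload above from two dates, assuming dates are sent as `yyyyMMdd` strings as in the sample. Python is used only for illustration and `date_range_filter` is a hypothetical helper, not part of the search API; the field names are taken from the payload.

```python
from datetime import date


def date_range_filter(start: date, end: date, fields=("Field_61", "Field_110")) -> dict:
    """Build an AND filter wrapping an OR of range clauses, one per date field."""
    bounds = {"from": start.strftime("%Y%m%d"), "to": end.strftime("%Y%m%d")}
    ranges = [{"type": "range", "field": f, **bounds} for f in fields]
    return {
        "type": "boolean",
        "Operator": "and",
        "clauses": [{"type": "boolean", "Operator": "or", "clauses": ranges}],
    }


# Reproduces the FilterStatement of the sample payload.
filter_statement = date_range_filter(date(2019, 12, 31), date(2021, 3, 30))
```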
Radomir Mladenovic, Contegra
Morgan Maguire, CEO Morgan , you're absolutely right that 30 seconds for 719 results is too much. The problem is that for those 719 results, far more data is pulled from the search index. I'll try to explain using the date range search provided by Harsh in the previous message.

The search itself reports 670 results and executes in about 150ms! However, as requested by the dev team, we don't return these results but do one more step: we collect all the different disputeIds that appear in these results and return all results whose ContentTypeDataMasterId matches one of them. This blows up to 96859 results to be collected and returned. I really don't know the data model of the application but something is very fishy to me here - I don't think that almost 100K nodes are needed to represent 670 results on the page. In the end, this takes tens of seconds.

I think too much pressure and expectation is put here on the search service, to do a job it's not really suited for. The real search is the one that found data in less than 200ms. I think the second step of the search is a more appropriate job for the database.

Harsh Parikh, Tech Lead at DevIT Harsh , I'd suggest taking a step back here and re-organizing some things. Instead of doing the second step in search, I'd suggest that for the Disputes search we only return the collected disputeIds (e.g. as a list in the SelectedNodes) and you get the data you need from the database. As we're not doing highlighting here, there's no advantage in getting these results from the search index. In addition, you could use a caching layer in your application to cache node data and make it even more efficient.
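The reorganization suggested above could look roughly like the sketch below: the search step yields only distinct disputeIds, and the application resolves node data itself behind a cache. Python is used only for illustration; `FAKE_DISPUTE_TABLE` stands in for the real database, and the hit shape, node names, and function names are all hypothetical.

```python
from functools import lru_cache

# Hypothetical stand-in for the application's database of dispute nodes.
FAKE_DISPUTE_TABLE = {13585: "Dispute node A", 21375: "Dispute node B"}


@lru_cache(maxsize=10_000)
def load_dispute_node(dispute_id: int) -> str:
    """Fetch node data once per disputeId; repeat lookups hit the cache."""
    return FAKE_DISPUTE_TABLE[dispute_id]


def collect_dispute_ids(search_hits):
    """Return the distinct disputeIds from the hits, in first-seen order."""
    return list(dict.fromkeys(hit["disputeId"] for hit in search_hits))


def build_results(search_hits):
    """Pair each distinct disputeId with its node data from the database."""
    return [(d, load_dispute_node(d)) for d in collect_dispute_ids(search_hits)]


hits = [{"disputeId": 13585}, {"disputeId": 21375}, {"disputeId": 13585}]
results = build_results(hits)
```

The point of the split is that the search index only does the fast part (finding matches), while the per-dispute data comes from the store that already owns it.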
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Can we have a call to discuss the above? We are not clear on what you want changed to make the search faster.

We can discuss on the call and finalize the solution.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We have a solution: if you return the first search result from the Dispute index only, that is enough for us, because we need only the DisputeId (FirstId) and the node name.

After getting the result, when the user clicks on a node name, our second call (the Dispute-doc method) will remain as it is, because that call doesn't take as much time.

Let me know so we can have a quick call and conclude this.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Thanks for the call.

As discussed, we need the first search result from the Dispute indexes, limited to 10000 rows.

As finalized, we have provided 2 columns in the FE_MetafieldwithValueDynamic query:

  • DisputeCitation
  • IsDisputeCitation

The DisputeCitation column is the node name of the dispute. The IsDisputeCitation column is set to 1 where DisputeCitation is available, and to 0 for the other rows, where DisputeCitation is null.

You need to return only the DisputeId and DisputeCitation columns for rows where IsDisputeCitation is set to 1.
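As a rough sketch of that rule, assuming each row of the query result is available as a dictionary keyed by column name. Python is used only for illustration; `dispute_nodes` and the sample citation text are hypothetical.

```python
def dispute_nodes(rows):
    """Keep only rows flagged as dispute citations and project two columns."""
    return [
        {"DisputeId": r["DisputeId"], "DisputeCitation": r["DisputeCitation"]}
        for r in rows
        if r["IsDisputeCitation"] == 1
    ]


# Illustrative rows: one with a citation (flag 1), one without (flag 0).
sample_rows = [
    {"DisputeId": 13585, "DisputeCitation": "Example v. State", "IsDisputeCitation": 1},
    {"DisputeId": 13585, "DisputeCitation": None, "IsDisputeCitation": 0},
]
nodes = dispute_nodes(sample_rows)
```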


Our second search call will remain as it is: we pass the whole DisputeId collection and get the result.

We have updated the FE_MetafieldwithValueDynamic query on the server for the database ISLGRebuildStaging.

Let me know if you have any questions.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh Thanks. I'll build a new index and start making changes for this. I'll let you know if I have any questions.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh the indexer is breaking now because FE_MetafieldwithValueDynamic doesn't provide the RowId column any more. Is there another column that should be used as an identifier now? If not, can you please fix the RowId?
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We have updated the FE_MetafieldwithValueDynamic query on the server. You can now get RowId as the identifier.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh   I modified the search controller as discussed. I'm getting results for the date filter in less than 0.5s. I hope it returns all data you were expecting.
You can get the update from the 2021-03-18 folder.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

If you pass the following date filter, do you get a result of about 208 disputes?

  • Entered "1 January 2020" into From Date field
  • Entered "31 March 2021" into To Date field
Naomi Joanis, UX Team Lead at Industrial
Hi Radomir Mladenovic, Contegra Radomir

Yesterday I tested the FTS feature and there were some bugs that arose to bring up here:

#22515 - Stemming > Doesn't Work
  • check off stemming
  • type in "like"
  • enter search
  • select the first document result: Methanex Corporation v. United States of America, UNCITRAL, Transcript of Hearing on Jurisdiction and Admissibility, 11 July 2001
  • View the document PDF and Ctrl+F search for "likely" (it appears on page 493)
  • Go back to search results and select page excerpt for 493
Result: The word likely isn't highlighted

Expected: Will be highlighted if stemming is enabled


#22517 – Synonym > Doesn't Work

  • Set synonyms to on
  • search "bias"
  • View first result
  • View page 17 excerpt
Result: The word prejudice exists but isn't highlighted

Expected: Will be highlighted


Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , yes, I'm getting 208 results for that range query.
Radomir Mladenovic, Contegra
Hi Naomi Joanis, UX Team Lead at Industrial Naomi

#22515 - Stemming > Doesn't Work

I think you might be mixing up the "physical" PDF page with the number printed on the page. Search results show the physical page numbers. On physical page 493 there's "likeliness", I think, but the text on that page shows page number 495. That means that the page 493 you were looking at is physical page 491. I opened it and it shows "likely" highlighted:
Radomir Mladenovic, Contegra
#22517 – Synonym > Doesn't Work
I've found and fixed one issue related to this. It should be fine in the search service update I provided.
Naomi Joanis, UX Team Lead at Industrial
Hi Radomir Mladenovic, Contegra Radomir ,

Re #22515 - Stemming > Doesn't Work

I understood from your response that I should be looking at the page excerpt for 491 instead of 493 where the word "likely" appears in the pdf physical numbers which makes sense. When I look at the results card in FTS I'm not seeing the page excerpt for 491. I am only seeing a result on 490 for "likeness". Based on your screenshot this seems like I should have an excerpt for page 491?
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Please let us know once you have resolved bugs no. 22515 and 22517 for the FTS module, and provide us with the update.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh ,

I don't see any issue with these:

#22517 - the example below shows both "bias" and "prejudice" highlighted when searching for "prejudice"

#22515 - yesterday I sent a screenshot showing highlighting for stemming.

Harsh Parikh, Tech Lead at DevIT Harsh   make sure you send necessary search options (Stemming and WordNetSynonyms) to "highlight-para" if you want them to work. If it still doesn't work for you, please send me your request payload.
Harsh Parikh, Tech Lead at DevIT
Radomir Mladenovic, Contegra Radomir , I will send you the request payloads for the above 2 issues within half an hour.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

The following are the request payloads for the above 2 bugs.

Bug no. 22515 

{"searchRequest":"like","SearchType":"Boolean","Stemming":true,"WordNetSynonyms":false,"Fuzzy":false,"Fuzziness":"1","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"DocumentContentTypeId","values":["37","13","12"]}]},"PageNum":0,"PageSize":20}


Bug No : 22517

{"searchRequest":"bias","SearchType":"Boolean","Stemming":false,"WordNetSynonyms":true,"Fuzzy":false,"Fuzziness":"1","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"DocumentContentTypeId","values":["37","13","12"]}]},"PageNum":0,"PageSize":20}


Could you please check and let us know.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh   these are search payloads. In the above issue I believe we're talking about paragraph highlighting ("highlight-para" service method).
Harsh Parikh, Tech Lead at DevIT
OK, my mistake, Radomir Mladenovic, Contegra Radomir . I will provide them.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

The following are my search requests for highlight-para.

Bug no. : 22515


{"searchRequest":"like","SearchType":"3","Stemming":"true","Synonyms":"false","Fuzzy":"false","Fuzziness":"1","paraId":"ECB6D91BB3177C32B5E0B70F4E5AC7C1#MTQ="}

Bug no. : 22517

{"searchRequest":"bias","SearchType":"3","Stemming":"false","Synonyms":"true","Fuzzy":"false","Fuzziness":"1","paraId":"80D3583E561ED87838529EE310CB553A#MTM="}
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh

#22515 - in the sample you gave me, only "like" appears. please provide another example showing that stemming is not working.

#22517 - you're sending Synonyms instead of WordNetSynonyms. It works when you put WordNetSynonyms=true.
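A minimal sketch of the fix for #22517 on the client side, assuming the only change needed is the key name. Python is used only for illustration; `use_wordnet_synonyms` is a hypothetical helper, and the payload is the highlight-para request posted above.

```python
def use_wordnet_synonyms(payload: dict) -> dict:
    """Return a copy of the payload with 'Synonyms' renamed to 'WordNetSynonyms'."""
    fixed = dict(payload)
    if "Synonyms" in fixed:
        fixed["WordNetSynonyms"] = fixed.pop("Synonyms")
    return fixed


old_payload = {
    "searchRequest": "bias",
    "SearchType": "3",
    "Stemming": "false",
    "Synonyms": "true",
    "Fuzzy": "false",
    "Fuzziness": "1",
    "paraId": "80D3583E561ED87838529EE310CB553A#MTM=",
}
new_payload = use_wordnet_synonyms(old_payload)
```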
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Could you please look at Morgan's last video in the following Basecamp thread?

Dispute Documents search field does not produce any results - TOLOGIX - ISLG App Rebuild

And please provide your feedback on the Dispute Document search.

The following is my request payload, based on the data in Morgan's video.

{"searchRequest":"","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"Field_62","values":["234","235"]},{"type":"match","field":"Field_DocumentTypeId","values":["1064","1067"]},{"type":"match","field":"Field_109","values":["487"]},{"type":"boolean","Operator":"or","clauses":[{"type":"range","field":"Field_61","from":"20200229","to":"20210320"},{"type":"range","field":"Field_110","from":"20200229","to":"20210320"}]}]},"SortField":"FullCitation","SortOrder":"asc"}
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

In the FTS module, the Contegra team raised one issue.

Bug No. : 22502

Steps to reproduce:
  • Go to FTS
  • Set document type to arbitration rules and treaties
  • Select "international Centre for Settlement of Investment Disputes" as arbitration rule type
  • Select Free Trade Agreement (FTA) in treaty type
  • Enter the keyword "tribunal"
  • View results

Result: No results show, even though there are results for this arbitration rule type

Expected: Will see results


The following is my search request for the above scenario.

{"searchRequest":"tribunal","SearchType":"Boolean","Stemming":false,"WordNetSynonyms":false,"Fuzzy":false,"Fuzziness":"1","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"Field_27","values":["368"]},{"type":"match","field":"Field_34","values":["44"]},{"type":"match","field":"DocumentContentTypeId","values":["13","12"]}]},"PageNum":0,"PageSize":20}


I assume we need to set an OR condition between the metafields.

Please suggest.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , as for the filtering example in Disputes, I don't see anything strange in your search request. However, I don't know which name is which in the app, so it's hard to say what's wrong. What are the field names of the filters that don't work?
Does filtering work if you use only filters on one of these fields?
Are you sure these fields are included in the FE_MetafieldwithValueDynamic?
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh as for the FTS search, I see that Field_27 and Field_34 don't have results with AND. Search works when I filter only on one of them but I didn't see the other field in the results.
I'm not sure if you need to use OR, from the UI it doesn't look like OR would be the expected behavior. 
Make sure you're using the right field name.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

As per Morgan's video, if we apply the following filters together with other filters, the search does not work.

Respondent State : [Field_109] : value 487
Applicable Instrument(s) : [Field_69] : value 11333
Applicable Arbitration Rules : [Field_70] : value 11086

All of the above data is available in the FE_MetafieldwithValueDynamic result. But Morgan said that when the above fields are applied in combination with other filters, the search doesn't work, whereas if only the above field filters are applied, it works.

Let me know if you need a call. I am available.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

For the FTS bug, each filter works individually with the word "tribunal". For example:

Searching with only [Field_27] and the word "tribunal" produces results: 22 results.

Searching with only [Field_34] and the word "tribunal" produces results: 16 results.

But Naomi's expectation is that if we apply both filters, we should get 38 results.

So what do we need to change in the JSON search query?
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh as filters work in some other cases, the first thing I'd check here is data. Can you find a row in the stored proc results that contains data matching both fields? If the data is there, please send me the IDs so that I can track it down in the indexing log and index.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Are you asking about the FTS bug?
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

If you are available, can we have a call to resolve both the Dispute Document and FTS bugs?
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir

For the FTS bug, the following is just an example of RowIds:

[Field_27] is available in RowIds (1, 2, 3)

[Field_34] is available in RowIds (535, 558)

We need all of these RowIds in the result.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh , as per your previous message, the data is the problem. For FTS, one result comes from one row. If you don't have both fields in a single row, then how can it appear in results when combined using AND?

I can also confirm this differently... When I run a search with Field_34 only, I get 16 results as you said. I'm attaching the result JSON and you can see there's no Field_27 in the results at all. So, it was not in the data that was indexed.
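The row-level behavior described above can be illustrated with a toy model, assuming each indexed item is one row and a filter only sees the fields stored on that row. Python is used only for illustration; the RowIds are synthetic, and only the counts (22 and 16) and field values come from this thread.

```python
# 22 rows carrying only Field_27 and 16 rows carrying only Field_34.
rows = (
    [{"RowId": i, "Field_27": "368"} for i in range(1, 23)]
    + [{"RowId": i, "Field_34": "44"} for i in range(101, 117)]
)


def has_value(row, field, value):
    """True when the row itself stores the given value in the given field."""
    return row.get(field) == value


# AND needs both fields on the SAME row, which never happens in this data.
and_hits = [
    r for r in rows
    if has_value(r, "Field_27", "368") and has_value(r, "Field_34", "44")
]

# OR accepts a row that satisfies either clause.
or_hits = [
    r for r in rows
    if has_value(r, "Field_27", "368") or has_value(r, "Field_34", "44")
]
```

Under these assumptions the AND filter matches no rows, while the OR filter matches all 38.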
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Yes, I know that [Field_34] and [Field_27] data are not available together when combined with AND.

That's why I am asking how to set an OR condition in the request payload so that we get results.

The following is my current JSON request:

{"searchRequest":"tribunal","SearchType":"Boolean","Stemming":false,"WordNetSynonyms":false,"Fuzzy":false,"Fuzziness":"1","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"Field_27","values":["368"]},{"type":"match","field":"Field_34","values":["44"]},{"type":"match","field":"DocumentContentTypeId","values":["13","12"]}]},"PageNum":0,"PageSize":20}


So, is there any way to set an OR condition in the above request payload so that we get all 38 results?
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh sorry, I didn't understand because this type of query was discussed already (e.g. my comment from Sep 14). 
Just nest the OR clause within the AND:

{
    "searchRequest": "tribunal",
    "SearchType": "Boolean",
    "Stemming": false,
    "WordNetSynonyms": false,
    "Fuzzy": false,
    "Fuzziness": "1",
    "FilterStatement": {
        "type": "boolean",
        "Operator": "and",
        "clauses": [
            {
                "type": "boolean",
                "Operator": "or",
                "clauses": [
                    {
                        "type": "match",
                        "field": "Field_27",
                        "values": [
                            "368"
                        ]
                    },
                    {
                        "type": "match",
                        "field": "Field_34",
                        "values": [
                            "44"
                        ]
                    }
                ]
            },
            {
                "type": "match",
                "field": "DocumentContentTypeId",
                "values": [
                    "13",
                    "12"
                ]
            }
        ]
    },
    "PageNum": 0,
    "PageSize": 20
} 
This one returns 38 results.
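For clients that build these payloads in code, the nesting above can be expressed with two small helpers. This is only a sketch in Python for illustration; `match` and `boolean` are hypothetical helper names, and the clause shapes are copied from the JSON above.

```python
def match(field, *values):
    """Build a 'match' clause for one field and one or more values."""
    return {"type": "match", "field": field, "values": list(values)}


def boolean(operator, *clauses):
    """Build a 'boolean' clause combining sub-clauses with 'and' or 'or'."""
    return {"type": "boolean", "Operator": operator, "clauses": list(clauses)}


# (Field_27 = 368 OR Field_34 = 44) AND DocumentContentTypeId in (13, 12)
filter_statement = boolean(
    "and",
    boolean("or", match("Field_27", "368"), match("Field_34", "44")),
    match("DocumentContentTypeId", "13", "12"),
)
```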
Harsh Parikh, Tech Lead at DevIT
OK Thanks Radomir Mladenovic, Contegra Radomir ..

Will try this and will update you.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Could you provide feedback on the Dispute Document result?

-------

As per Morgan's video, if we apply the following filters together with other filters, the search does not work.

Respondent State : [Field_109] : value 487
Applicable Instrument(s) : [Field_69] : value 11333
Applicable Arbitration Rules : [Field_70] : value 11086

All of the above data is available in the FE_MetafieldwithValueDynamic result. But Morgan said that when the above fields are applied in combination with other filters, the search doesn't work, whereas if only the above field filters are applied, it works.

Let me know if you need a call. I am available.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh , the same thing as with the FTS. Please find a database row in  FE_MetafieldwithValueDynamic where those fields are present together. If they are not expected to be together, then you probably need to OR them. I cannot answer this question for you as I don't know your data model. 
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

For the Dispute Document module, suppose I search with Language: Russian and Respondent State: Lithuania; then I should get results.

We don't need to apply the OR operator; the operator should be AND.

The following is the search request payload for Language: Russian and Respondent State: Lithuania:

Field_109 : Respondent State (value : 487)
Field_62 :  Language (value : 248)

{"searchRequest":"","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"Field_62","values":["248"]},{"type":"match","field":"Field_109","values":["487"]}]},"SortField":"FullCitation","SortOrder":"asc"}


In FE_MetafieldwithValueDynamic Query you get Field_109 value in RowId : 1181 and Field_60 value in RowId : 8829

In both rows the first value, DisputeId, is 13585, so we expect the first search to return the DisputeId from RowId 1181.

And for the second search you should pass the DisputeId collection [13585,21375] and provide the result.

Please let us know. 
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh you say:

In FE_MetafieldwithValueDynamic  Query you get Field_109 value in RowId : 1181 and Field_60 value in RowId : 8829

That's exactly why it doesn't work with AND. The fields do not belong to the same indexed row. What do you want me to do here? Any workaround that I could apply will result in a significant performance degradation. The fix should be in the database view.

As the first search is pulling data for DisputeIds, I'd say you need to think about changing FE_MetafieldwithValueDynamic in a way that the DisputeId is unique in the results (so that it can be used as an identifier instead of RowId) and you include all fields related to it. That way both Field_109 and Field_60 should appear in the same row.

If that's too complicated on your end, maybe we could make it in two steps:
  • Change FE_MetafieldwithValueDynamic so that DisputeId is used as a row identifier, but include only fields which you can.
  • Make an additional view/proc/query that returns all custom fields for a given disputeId? It could return multiple rows, doesn't matter. 
Then, I could change the indexer to run the second query for each row that appears in the FE_MetafieldwithValueDynamic, and add all found fields to the same indexed document. The indexer will probably run much longer but at least everything would be indexed properly.
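The two-step indexing idea above could be sketched as follows, assuming the second query returns one or more rows of custom fields per disputeId. Python is used only for illustration; `merge_dispute_fields` is a hypothetical name, and the field values in the sample rows are illustrative only.

```python
from collections import defaultdict


def merge_dispute_fields(dispute_id, field_rows):
    """Fold all rows for one dispute into a single document to be indexed."""
    merged = defaultdict(set)
    for row in field_rows:
        for field, value in row.items():
            if field != "DisputeId" and value is not None:
                merged[field].add(value)
    document = {"DisputeId": dispute_id}
    document.update({field: sorted(values) for field, values in merged.items()})
    return document


# Two rows for the same dispute, each carrying a different field.
rows_for_13585 = [
    {"DisputeId": 13585, "Field_109": "487", "Field_60": None},
    {"DisputeId": 13585, "Field_109": None, "Field_60": "248"},
]
document = merge_dispute_fields(13585, rows_for_13585)
```

Once both fields sit on the same indexed document, an AND filter across them has something to match.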

I hope this makes sense.

P.S. I will be out of the office the whole morning tomorrow and can get back to you only during my afternoon.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We cannot make DisputeId a unique identifier, as multiple documents are associated with a single dispute, so the same DisputeId can appear in multiple rows.

For the second option, the two-step process, let's discuss today, as we have only this week to complete this task.

Let us know once you are available for a call to finalize this.

We are available between 10:00 AM and 6:00 PM IST.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh as I said in my previous message, I'm not available today during your work hours. I'm going on a trip in a few minutes and will only be available late afternoon/tonight Europe time, when I'm back.
From everything discussed these months, I hope you understand how indexing and search work. Please provide a view (or views) that will allow us to index all metadata that you need associated with a single row in the first results table, as that will be the search result item.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

It is not possible to put all metafields in a single row, as the search looks at different rows.

We should go with the two-step process as per your suggestion, but we need to discuss and confirm which data we will provide you in the view/stored procedure.

Let us know when you are available for a call, in IST.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

As per the discussion on today's call about the Dispute Document filter:

The initial search query FE_MetafieldwithValueDynamic will remain as it is. We haven't changed anything. This means you provide the initial search result from this index.

As discussed, we created a new query, FE_SelectDocumentViewForContegraSearch, on the server (database: ISLGRebuildStaging), which contains all Dispute Document metafields and the DisputeId.

You can use FE_SelectDocumentViewForContegraSearch as the second-step query for the initial search.

Please let me know if you have any questions.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh please add a disputeId parameter to the FE_SelectDocumentViewForContegraSearch so that it returns data for a single disputeId only.

Also, I need a separate procedure that returns only disputes! The FE_MetafieldwithValueDynamic can remain as is for the "dispute-details" call. However, for indexing, I need one procedure that returns only disputes with their metadata. Using disputeId from each row, I'd pass it to FE_SelectDocumentViewForContegraSearch to get dispute documents for each dispute.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

You can find the DisputeId column (3rd column) in the stored procedure FE_SelectDocumentViewForContegraSearch.



The new separate procedure for Dispute metafields only is: FE_SelectDisputeViewForContegraSearch

But please make sure you provide results from the FE_MetafieldwithValueDynamic query, as all sorting fields and the fields containing DisputeId are available in this query.

Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh   I know that the disputeId is in the results. I need results to return data only for a particular disputeId. This method will be called for each row in the FE_SelectDisputeViewForContegraSearch.
Another option is that we read data from FE_SelectDisputeViewForContegraSearch once, cache it, and then only filter data when needed. This will be much faster but will take more RAM. I'm not sure how many rows are in this. If you think that the complete data set can be kept in memory, we can proceed this way.
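The read-once-and-cache option could look roughly like this sketch, assuming the full result set fits in memory and each row carries a `DisputeId` column. Python is used only for illustration; `build_dispute_cache` and the sample rows are hypothetical.

```python
from collections import defaultdict


def build_dispute_cache(all_rows):
    """Group every row by DisputeId once, so later lookups need no query."""
    cache = defaultdict(list)
    for row in all_rows:
        cache[row["DisputeId"]].append(row)
    return dict(cache)


# Illustrative rows standing in for a single full read of the procedure.
all_rows = [
    {"DisputeId": 13585, "Field_109": "487"},
    {"DisputeId": 21375, "Field_109": "403"},
    {"DisputeId": 13585, "Field_60": "248"},
]
cache = build_dispute_cache(all_rows)
rows_for_dispute = cache.get(13585, [])
```

The trade-off is the one named above: one query instead of one per dispute, at the cost of holding every row in RAM during indexing.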

Next, the new FE_SelectDisputeViewForContegraSearch doesn't even have column disputeId. Please fix or let me know how to use it.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We have added the DisputeId column to FE_SelectDisputeViewForContegraSearch.

Both queries return the full dataset.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh I uploaded the indexer and search service update to the 2021-03-24 folder.

The indexer config for disputes should be updated - IndexStoredProcTables was modified, and DisputeDocsMetadataProc is the new property:
  "IndexStoredProcTables": [ "FE_SelectDisputeViewForContegraSearch" ],
  "DisputeDocsMetadataProc": "FE_SelectDocumentViewForContegraSearch",
Just a reminder: delete the old index folder before indexing.

Let me know how it worked for you.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We made the above changes and created new indexes, but it doesn't work.

I don't understand why you removed FE_MetafieldwithValueDynamic from the indexer, because we need the initial search results from this index.

We provided the following 2 queries just to filter the data and pass the DisputeId to the FE_MetafieldwithValueDynamic index, so that we get all the desired results.

FE_SelectDisputeViewForContegraSearch
FE_SelectDocumentViewForContegraSearch

This is because all our model columns and sorting-field columns for the initial search are available in the query FE_MetafieldwithValueDynamic.

Please let me know when you are available so we can have a call and discuss.
Radomir Mladenovic, Contegra
Harsh, please explain how it doesn't work. I understood for the first search you need only dispute info, and that on click you'll get the rest using "dispute-details".
If you prefer the old result format, we can keep the index from the old procedure and use it for results after collecting the disputes from this search.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

If I search for "Canada", it gives 0 results. Previously, it gave 34 dispute records.

Now any search gives me a null result.

The following is my search request.

{"searchRequest":"","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"Field_109","values":["403"]}]},"SortField":"FullCitation","SortOrder":"asc"}


But I still don't understand how you will provide the results, because all the columns I need in my model are available in the query FE_MetafieldwithValueDynamic, and you are no longer indexing that query's result.

Please, let's have a call and finalize this.
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh ,

Please check the things on your end. For the query you sent me, I'm getting 34 results. Response attached.

Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh Not sure why you have DocIdColumnName: "ContentTypeDataMasterID" in your config. I have "DocIdColumnName": "RowId", but it has been like that for a long time; it's not a new change.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Can we have a quick call? I think there is some confusion about the indexing.json file.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Could you provide your indexing JSON files for both Dispute and Dispute Document, so we can check again?
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

The initial search is working now. But when we call the second method, disputes-details, it doesn't return the Dispute Document data.

Here is my search request for the disputes-details service method.

{"searchRequest":"","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"Field_62","values":["234"]}]},"SortField":"FullCitation","SortOrder":"asc","FieldFilterName":"disputeid","FieldFilterValues":[12877]}

As you may remember, we pass the disputeId collection to our second query, SelectDisputeContegraSearch, to get the dispute detail and document results.

Please let us know.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Could we connect about the above query, or are you already looking into it?
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , I'm looking into this but have some doubts about your use of the filter in the second call. I'll call you on Skype.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh   updated indexer and search service are in the 2021-03-25 folder.

Regarding your previous search example, are you sure you provided a good disputeId? For disputeId 12877 I don't see data in the database.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

When I copied SearchController and ran the API project, it gave me the following error.

Severity Code Description Project File Line Suppression State
Error CS1061 'TologixSettings' does not contain a definition for 'DisputeComboIndex' and no accessible extension method 'DisputeComboIndex' accepting a first argument of type 'TologixSettings' could be found (are you missing a using directive or an assembly reference?) TologicWebSearch D:\Harsh\Harsh\Contegra Projects\TologixWebSearch\Controllers\SearchController.cs 503 Active

We have also updated the app.config file and set DisputeComboIndex under TologixSettings.

Is there anything missing from your side?
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

The following line in SearchController is underlined in red.

Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We still don't get the Dispute Document detail data from SelectDisputeContegraSearch. As we discussed, we need to pass all the dispute and dispute document IDs that we provide in the query FE_MetafieldwithValueDynamic (DisputeId column).

Currently we get only the dispute detail when we call our second method, disputes-details.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh I added AppSettings.cs to the same folder but I guess you already fixed that.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

As discussed, I have updated the DisputeId column in the following stored procedure on our server for the database ISLGRebuildStaging.

FE_SelectDocumentViewForContegraSearch



As an example, for the first search you pass the following JSON request:

First request:

{"searchRequest":"A.M.F. Aircraftleasing Meier & Fischer GmbH & Co. KG v. Czech Republic, PCA Case No. 2017-15, Respondent Press Release, 5 December 2016 [Czech]","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"Field_62","values":["15541"]}]},"SortField":"FullCitation","SortOrder":"asc"}




On click, the second method fetches all the dispute detail and document detail using the second request:

Second request:


{"searchRequest":"A.M.F. Aircraftleasing Meier & Fischer GmbH & Co. KG v. Czech Republic, PCA Case No. 2017-15, Respondent Press Release, 5 December 2016 [Czech]","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"match","field":"Field_62","values":["15541"]}]},"SortField":"FullCitation","SortOrder":"asc","FieldFilterName":"disputeid","FieldFilterValues":[13406]}



I need results for both ContentTypeDataMasterId rows (13406 and 20758) from SelectDisputeContegraSearch.

13406 is our Dispute Detail
20758 is our Dispute Document Detail
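
Note: in the example above, the second request is the first request plus the two fields FieldFilterName and FieldFilterValues. A minimal Python sketch of deriving one from the other is shown below; the helper name build_details_request is hypothetical and is not part of the search service, and the request body is shortened to an empty keyword for readability.

```python
import copy


def build_details_request(first_request, dispute_ids):
    """Copy the first request and add the disputeid field filter."""
    second = copy.deepcopy(first_request)  # leave the original request untouched
    second["FieldFilterName"] = "disputeid"
    second["FieldFilterValues"] = list(dispute_ids)
    return second


first_request = {
    "searchRequest": "",
    "FilterStatement": {
        "type": "boolean",
        "Operator": "and",
        "clauses": [
            {"type": "match", "field": "Field_62", "values": ["15541"]}
        ],
    },
    "SortField": "FullCitation",
    "SortOrder": "asc",
}

second_request = build_details_request(first_request, [13406])
```

Copying the request keeps the keyword, filter statement and sort order identical between the two calls, so only the dispute filter differs.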
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , I sent you an update in the 2021-03-26 folder. For indexing, note that the config was updated as well. I added "ContentTypeDataMasterId" to the "FacetedFields". 

Let me know if this returns expected results.
Harsh Parikh, Tech Lead at DevIT
OK Radomir Mladenovic, Contegra Radomir . I will integrate and check and get back to you.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

The search is working, but we found one issue: if a dispute doesn't have any documents, the second request doesn't return the dispute data.

Can we have a quick call?
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

The first request is working now. The second request is also working, but for some disputes it gives me a bad request error and doesn't return a result.


This is my first request; it gives me 186 results, which is fine.

{"searchRequest":"","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"boolean","Operator":"or","clauses":[{"type":"range","field":"Field_61","from":"20200229","to":"20210324"},{"type":"range","field":"Field_110","from":"20200229","to":"20210324"}]}]},"SortField":"FullCitation","SortOrder":"asc","FieldFilterName":"disputeid","FieldFilterValues":[22997]}


Second request: when I click on a dispute, the second request is sent, but it doesn't return a result; it throws a bad request error.

{"searchRequest":"","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"boolean","Operator":"or","clauses":[{"type":"range","field":"Field_61","from":"20200229","to":"20210324"},{"type":"range","field":"Field_110","from":"20200229","to":"20210324"}]}]},"SortField":"FullCitation","SortOrder":"asc","FieldFilterName":"disputeid","FieldFilterValues":[22997]}




The following second request is working.

{"searchRequest":"","FilterStatement":{"type":"boolean","Operator":"and","clauses":[{"type":"boolean","Operator":"or","clauses":[{"type":"range","field":"Field_61","from":"20200229","to":"20210320"},{"type":"range","field":"Field_110","from":"20200229","to":"20210320"}]}]},"SortField":"FullCitation","SortOrder":"asc","FieldFilterName":"disputeid","FieldFilterValues":[12400]}
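
Note: the FilterStatement in these requests is a tree of "boolean" nodes (with an "and"/"or" Operator) over "match" and "range" leaves. The Python sketch below shows one plausible way to read such a tree against a single record. The semantics are assumptions for illustration only (match means the field value is one of the listed values; range compares yyyymmdd strings inclusively); how the search service actually evaluates filters is not shown in this thread.

```python
def matches(statement, record):
    """Evaluate a filter statement against one record (assumed semantics)."""
    kind = statement["type"]
    if kind == "boolean":
        results = [matches(clause, record) for clause in statement["clauses"]]
        if statement["Operator"].lower() == "and":
            return all(results)
        return any(results)
    if kind == "match":
        return str(record.get(statement["field"])) in statement["values"]
    if kind == "range":
        value = record.get(statement["field"])
        if value is None:
            return False
        # yyyymmdd strings sort in date order, so string comparison is enough.
        return statement["from"] <= str(value) <= statement["to"]
    raise ValueError(f"unknown clause type: {kind}")


# The date-range filter used in the requests above.
date_filter = {
    "type": "boolean",
    "Operator": "and",
    "clauses": [
        {
            "type": "boolean",
            "Operator": "or",
            "clauses": [
                {"type": "range", "field": "Field_61", "from": "20200229", "to": "20210324"},
                {"type": "range", "field": "Field_110", "from": "20200229", "to": "20210324"},
            ],
        }
    ],
}
```

Under these assumptions, a record matches when either Field_61 or Field_110 falls inside the date range.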
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh I updated the indexer, it's in the same folder. As you suspected, it was related to a dispute that has no documents.
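
Note: the expected behaviour for a dispute without documents can be pictured as a left join: the dispute detail row is always returned, followed by zero or more document rows. The Python sketch below is only an illustration of that rule; the function name and the "kind" key are hypothetical, and the two IDs are the example rows quoted earlier in this thread.

```python
def build_details_rows(dispute, documents):
    """Return the dispute row first, then one row per document (possibly none)."""
    rows = [{"kind": "dispute", **dispute}]
    rows.extend({"kind": "document", **doc} for doc in documents)
    return rows


with_documents = build_details_rows(
    {"ContentTypeDataMasterId": 13406},
    [{"ContentTypeDataMasterId": 20758}],
)
without_documents = build_details_rows({"ContentTypeDataMasterId": 13406}, [])
```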
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

That was my assumption: a dispute that has no documents causes the issue.

I will check tomorrow.

Could you please provide the name of the folder where you put the updated indexer?
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh it's under 2021-03-26, I updated the installer there.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

With this new indexer, the Dispute Document index is not created; it generates an error.

I have attached the indexing log file and the indexer config file. The first index (Dispute) is generated, but the second index (Dispute Document) is not. Please check.



Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh I fixed the indexer and it's under  the 2021-03-27 folder.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Hope you're doing well.

Naomi Joanis, UX Team Lead at Industrial Naomi has raised 2 issues in the FTS module.

Bug no. 22939

Steps to reproduce:
  • Go to FTS
  • Enter "t" in the keyword space
  • In filter dispute documents by > applicable treaty > select "Agreement between Japan and India for an Economic Partnership (2011) (excerpts)"
  • Enter search


Result: No results found

Expected: Will see the two published dispute documents where the dispute has this instrument



The following is the payload request for the above criteria. We found that we get 2 rows from the FE_MetafieldwithValueDynamicFTS query.

{
  "searchRequest": "t",
  "SearchType": "Boolean",
  "Stemming": true,
  "WordNetSynonyms": false,
  "Fuzzy": true,
  "Fuzziness": "1",
  "FilterStatement": {
    "type": "boolean",
    "Operator": "and",
    "clauses": [
      {
        "type": "boolean",
        "Operator": "or",
        "clauses": [
          {
            "type": "boolean",
            "Operator": "and",
            "clauses": [
              { "type": "match", "field": "DocumentContentTypeId", "values": ["13"] }
            ]
          },
          {
            "type": "boolean",
            "Operator": "and",
            "clauses": [
              { "type": "match", "field": "DocumentContentTypeId", "values": ["37"] }
            ]
          },
          {
            "type": "boolean",
            "Operator": "and",
            "clauses": [
              { "type": "match", "field": "DocumentContentTypeId", "values": ["12"] }
            ]
          }
        ]
      },
      {
        "type": "boolean",
        "Operator": "or",
        "clauses": [
          { "type": "match", "field": "Field_69", "values": ["12401"] }
        ]
      }
    ]
  },
  "PageNum": 0,
  "PageSize": 20
}


Could you please check and confirm.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Also, one more bug was raised by Naomi Joanis, UX Team Lead at Industrial Naomi .

Bug No. 22938 

Steps to reproduce:
  • FTS
  • Enter "t" in the keyword search
  • In the filter by results in other research tools section, select "Act concerning the conditions of accession of the Republic of Croatia to the European union (2011) (citation and source)" from Search treaty/instrument section
  • Submit search
  • View results


Result: There are no documents found

Expected: Will see documents that are underneath this instrument in the AC

The following is the payload request for the above criteria. We found that we get 2 rows from the FE_MetafieldwithValueDynamicFTS query.

{
  "searchRequest": "t",
  "SearchType": "Boolean",
  "Stemming": true,
  "WordNetSynonyms": false,
  "Fuzzy": true,
  "Fuzziness": "1",
  "FilterStatement": {
    "type": "boolean",
    "Operator": "and",
    "clauses": [
      {
        "type": "boolean",
        "Operator": "or",
        "clauses": [
          {
            "type": "boolean",
            "Operator": "and",
            "clauses": [
              { "type": "match", "field": "DocumentContentTypeId", "values": ["13"] }
            ]
          },
          {
            "type": "boolean",
            "Operator": "and",
            "clauses": [
              { "type": "match", "field": "DocumentContentTypeId", "values": ["37"] }
            ]
          },
          {
            "type": "boolean",
            "Operator": "and",
            "clauses": [
              { "type": "match", "field": "DocumentContentTypeId", "values": ["12"] }
            ]
          }
        ]
      },
      {
        "type": "boolean",
        "Operator": "or",
        "clauses": [
          { "type": "match", "field": "Field_ACReference", "values": ["22968"] },
          { "type": "match", "field": "Field_ACProvision", "values": ["22968_Generally"] }
        ]
      }
    ]
  },
  "PageNum": 0,
  "PageSize": 20
}
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh Naomi Joanis, UX Team Lead at Industrial Naomi what do you expect to match with the keyword "t"? Does it appear in texts? If I remove the keyword, I get 2 results. Also, I can get results using "c" - as I see that "C" appears in the filenames.
Harsh Parikh, Tech Lead at DevIT
Naomi Joanis, UX Team Lead at Industrial Naomi , please provide your feedback on the above comment from Radomir Mladenovic, Contegra Radomir as soon as possible.
Morgan Maguire, CEO
Hi Harsh Parikh, Tech Lead at DevIT Harsh , Naomi Joanis, UX Team Lead at Industrial Naomi and Radomir Mladenovic, Contegra Radomir ,

Naomi Joanis, UX Team Lead at Industrial Naomi is correct to be concerned with the difference in results between the legacy app and the new application, and we should get to the bottom of why that is, particularly when no filter is applied.

However, further to the video below, the results for the specific bug referenced above probably have to do with the fact that the 2 documents that are generated by the filter do not have HTML documents available in staging.islg.

At the same time, as described in the video, I noticed that we're displaying paragraph references both vertically and horizontally. Currently, this is very inconsistent, and I generally dislike the vertical alignment. Please ensure this is resolved in a consistent way.

Thanks,

Morgan

Harsh Parikh, Tech Lead at DevIT
Hi Naomi Joanis, UX Team Lead at Industrial Naomi ,

Please look into the vertical alignment issue in the UI. We found that if the page or paragraph count is less than 12, the references are shown vertically.
Radomir Mladenovic, Contegra
Naomi Joanis, UX Team Lead at Industrial Naomi I'm not sure if "t" is a valid search. The full-text search doesn't look for the letter "T" appearing anywhere in the text, but for the word "T". A search for "C" brings back results, so maybe "t" as a word is not present in the indexed documents. I don't know why the legacy app finds these documents, but possibly there is some difference in the content. It would require a more thorough investigation (comparing content, indexing, testing, etc.), but I'm not sure it is worth the resources, as I don't see "t" as a meaningful search keyword.
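
Note: the difference between matching a letter and matching a word can be illustrated with the simplified Python sketch below. It is not how dtSearch tokenizes text (its word-breaking rules, stemming and fuzzy options are more involved); the sample sentence and both function names are made up for illustration.

```python
import re


def contains_letter(text, letter):
    """Substring match: true if the letter appears anywhere in the text."""
    return letter.lower() in text.lower()


def contains_word(text, word):
    """Word match: true only if the word is a whole token of the text."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return word.lower() in tokens


sample = "Treaty between Canada and Chile, Annex C"

letter_t = contains_letter(sample, "t")  # "t" occurs inside "Treaty"
word_t = contains_word(sample, "t")      # no standalone word "t"
word_c = contains_word(sample, "c")      # "C" stands alone in "Annex C"
```

With this sample, the letter "t" is found but the word "t" is not, while the word "c" is found. That is consistent with the behaviour described above, where "C" returns results and "t" does not.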
Harsh Parikh, Tech Lead at DevIT
Hi Morgan Maguire, CEO Morgan and Naomi Joanis, UX Team Lead at Industrial Naomi ,

We have resolved the vertical alignment issue in the Page/Paragraph UI and uploaded the fix to staging.islg.
Harsh Parikh, Tech Lead at DevIT
Hi Morgan Maguire, CEO Morgan and Naomi Joanis, UX Team Lead at Industrial Naomi and Radomir Mladenovic, Contegra Radomir ,

I think both of the above issues are due to missing data and missing PDF and HTML files on staging.islg. Today we generated the index on app.islg, and it produces results for the letter "t".
Morgan Maguire, CEO
Ok. thanks Harsh Parikh, Tech Lead at DevIT Harsh . That would explain a significant difference in the results. Naomi Joanis, UX Team Lead at Industrial Naomi let's perform testing on app.islg where we have a much more complete set of HTMLs.

Also, let's come up with a better UI solution for the alignment of the paragraphs. Naomi Joanis, UX Team Lead at Industrial Naomi , Melissa Cowell, General Manager at Industrial Melissa and Savannah Mitchell, Project Manager at Industrial Savannah , could you please ensure Harsh Parikh, Tech Lead at DevIT Harsh is provided with updates to the template code today.

Thanks,

Morgan
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We have set up indexing for the go-live on 6th April, but when we try to generate the index for the FTS module, the following error is logged and the indexes are not created.

I have attached the indexing log and the indexer JSON file for the live database. Also, please note that we are using the following live database on the server to generate the index.

Database name: ISLGRebuildProduction






Please check and provide feedback.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

Please ignore the above comment. There was an issue in the query on our side, and we are looking into it.
Morgan Maguire, CEO
Hi Harsh Parikh, Tech Lead at DevIT Harsh ,

I noticed the DD search is currently not working on app.islg. Is this related to the indexing issues above?

Thanks,

Morgan
Harsh Parikh, Tech Lead at DevIT
Hi Morgan Maguire, CEO Morgan ,

There was a minor issue generating the index on app.islg. We have resolved it, and search is now working fine for all 3 modules on app.islg.