TOLOGIX - ISLG App Rebuild

Subject Navigator search field suggested search result does not take users to the suggested branch

Assigned to
Martin Laporte, CTO at Tologix Martin L.

Comments & Events

Paul Moon
Hi Martin Laporte, CTO at Tologix Martin :

As shown below, SN search field's suggested search does not take users to the branch selected and says "No records found" when the branch does in fact exist. I cannot isolate this issue to a class of branches. This behaviour seems similar to Re: Disputes & Dispute Documents search field and filters not working - TOLOGIX - ISLG App Rebuild where dtSearch indexing failure was the culprit according to Harsh Parikh, Tech Lead at DevIT Harsh .

Please add it to unplanned/critical and address it asap for testing, as it affects the live site users.

Thanks,

Paul
Harsh Parikh, Tech Lead at DevIT
Hi Paul Moon Paul and Martin Laporte, CTO at Tologix Martin ,

We have found two parts to the above issue.

1) A search that contains double quotation marks ("") does not work through dtSearch indexing.

2) If we search with a long keyword, the dtSearch API does not return a result. For example, if we search with (Waste Management v. Mexico II Award analyzes cases where a persistent and serious breach of a contract by a State organ can constitute expropriation, or conduct tantamount to expropriation), no result is found.

But if we search with only (Waste Management v. Mexico II), the result is found.


I am putting Radomir Mladenovic, Contegra Radomir and Rob Ferguson, Team Lead - Web Development at Industrial Rob in this thread to help us for this issue.

Radomir Mladenovic, Contegra Radomir , could you please look at the above video and my comment, and suggest how we can resolve this?

Cc : Piyush Kanpariya, DevIT Piyush  
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , I suspect the issue is searching for phrases that contain quotes and other special characters but I would have to reproduce this in order to confirm.
I'm on a trip the whole next week and not sure if I'll have enough bandwidth to troubleshoot this. It would be useful if you could send me document IDs of sample documents for which search didn't work, as well as search payloads you used. Thanks.
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

The following are the search payload requests for the Subject Navigator module that do not return results.

Search Word : "Alter ego" objections

{"ErrorMessage":null,"WasError":false,"SearchRequest":"\"Alter ego\" objections","PageNum":0,"PageSize":0,"Fuzzy":false,"Fuzziness":1,"Stemming":true,"WordNetSynonyms":false,"Synonyms":false,"PhonicSearching":false,"SearchType":3,"SortField":null,"SortOrder":null,"SearchFlags":0,"Custom":null,"NoFrames":false,"EnableDateSearch":false,"StartDate":null,"EndDate":null,"FileConditions":null,"BooleanConditions":null,"QueryStatement":null,"FilterStatement":null,"Facets":null,"IxId":null,"IndexIds":null,"IncludeSynopsis":true,"Near":14,"ExcludeEnabled":false,"ExcludeTerm":null,"TreePath":null,"paraId":null,"FieldFilterName":null,"FieldFilterValues":null,"docId":null,"docUrl":null,"SearchTypeId":3}



Search word :  Waste Management v. Mexico II Award analyzes cases where a persistent and serious breach of a contract by a State organ can constitute expropriation, or conduct tantamount to expropriation


{"ErrorMessage":null,"WasError":false,"SearchRequest":"Waste Management v. Mexico II Award analyzes cases where a persistent and serious breach of a contract by a State organ can constitute expropriation, or conduct tantamount to expropriation","PageNum":0,"PageSize":0,"Fuzzy":false,"Fuzziness":1,"Stemming":true,"WordNetSynonyms":false,"Synonyms":false,"PhonicSearching":false,"SearchType":3,"SortField":null,"SortOrder":null,"SearchFlags":0,"Custom":null,"NoFrames":false,"EnableDateSearch":false,"StartDate":null,"EndDate":null,"FileConditions":null,"BooleanConditions":null,"QueryStatement":null,"FilterStatement":null,"Facets":null,"IxId":null,"IndexIds":null,"IncludeSynopsis":true,"Near":14,"ExcludeEnabled":false,"ExcludeTerm":null,"TreePath":null,"paraId":null,"FieldFilterName":null,"FieldFilterValues":null,"docId":null,"docUrl":null,"SearchTypeId":3}

Database : ISLGRebuildProduction
Server :  10.68.138.14



Radomir Mladenovic, Contegra Radomir , please note that for the second search payload, if we use only the following search word, we do get a search result.

Search word :  Waste Management v. Mexico II


Cc :
Martin Laporte, CTO at Tologix Martin Paul Moon Paul Piyush Kanpariya, DevIT Piyush  
Martin Laporte, CTO at Tologix
Hi Radomir Mladenovic, Contegra Radomir ,

Can you give us an update on this issue? Would it be easier to setup a Zoom call?

Thanks,
-Martin

CC: Rob Wiesenberg, Contegra Rob  
Radomir Mladenovic, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh ,

From what I see, there are no results found because you have HTML tags (<EM>) in the "branchname" field where the expected text appears:

"branchname": "<EM>Waste Management v. Mexico II</EM> Final Award analyzes cases where a persistent and serious breach of a contract by a State organ can constitute expropriation, or to conduct tantamount to expropriation"

dtSearch does not support HTML tags in meta fields so any tags are indexed as text and affect searching:
  • "SearchRequest": "\"Waste Management v. Mexico II\""
    This works as the phrase is within <EM>
  • "SearchRequest": "\"<EM>Waste Management v. Mexico II</EM> Final Award analyzes cases where a persistent and serious breach of a contract by a State organ can constitute expropriation, or to conduct tantamount to expropriation\""
    This works as the complete content with <EM> is included in the search phrase.
  • "SearchRequest": "\"EM Waste Management v. Mexico II EM Final Award analyzes cases where a persistent and serious breach of a contract by a State organ can constitute expropriation, or to conduct tantamount to expropriation\""
    This works as well - you see EM is still present without tag brackets
  • "SearchRequest": "\"Waste Management v. Mexico II Final Award analyzes cases where a persistent and serious breach of a contract by a State organ can constitute expropriation, or to conduct tantamount to expropriation\""
    This doesn't work.
I hope this explains it. 
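
The behaviour described above can be illustrated with a minimal sketch. It assumes a toy tokenizer that splits on anything that is not a letter or digit, so the <EM> tags contribute the word "em"; this is only an approximation for illustration and not dtSearch's actual indexing algorithm. The function names are hypothetical, and the field value is shortened.

```python
import re

def tokenize(text):
    """Toy tokenizer (an assumption, not dtSearch's real one): lowercase the
    text and split on non-alphanumeric characters, so "<EM>" yields "em"."""
    return re.findall(r"[a-z0-9]+", text.lower())

def phrase_matches(field_value, phrase):
    """Return True if the phrase's words appear consecutively in the field."""
    words = tokenize(field_value)
    target = tokenize(phrase)
    n = len(target)
    return n > 0 and any(
        words[i:i + n] == target for i in range(len(words) - n + 1)
    )

# Shortened version of the "branchname" value quoted in the thread.
branchname = "<EM>Waste Management v. Mexico II</EM> Final Award analyzes cases"

# Phrase entirely inside the tag: the words are adjacent in the index.
inside_tag = phrase_matches(branchname, "Waste Management v. Mexico II")

# Phrase that crosses the closing tag: "em" sits between "ii" and "final".
across_tag = phrase_matches(
    branchname, "Waste Management v. Mexico II Final Award"
)

# Phrase that includes the tag names as plain words.
with_em_words = phrase_matches(
    branchname, "EM Waste Management v. Mexico II EM Final Award"
)
```

Under these assumptions, the phrase inside the tag and the phrase with the EM words match, while the phrase that crosses the tag boundary does not, which mirrors the four cases listed above.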
Harsh Parikh, Tech Lead at DevIT
Got your point, Radomir Mladenovic, Contegra Radomir . But our second query is about search words in double quotation marks ("").

For example, if we pass the search word "Alter ego" objections, it doesn't work.

Please suggest.
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh quotes are not indexed. 
  • If your search type is phrase, do not send quotes in the search request.
  • If search type is boolean for example, send with quotes around everything (e.g. "Alter ego objections")
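
A minimal sketch of the two rules above, assuming the caller knows which search type it is using. The labels "phrase" and "boolean" and the function name are hypothetical; they are not dtSearch constants and do not correspond to the numeric SearchType values in the payloads earlier in this thread.

```python
def build_search_request(user_text, search_type):
    """Prepare the SearchRequest string for a given search type.

    search_type is a hypothetical label ("phrase" or "boolean"),
    not a dtSearch constant.
    """
    # Quotes are not indexed, so drop any the user typed and tidy whitespace.
    cleaned = " ".join(user_text.replace('"', " ").split())
    if search_type == "phrase":
        # Phrase search: send the words without quotes.
        return cleaned
    if search_type == "boolean":
        # Boolean search: wrap everything in one pair of quotes.
        return f'"{cleaned}"'
    raise ValueError(f"unsupported search type: {search_type}")

phrase_request = build_search_request('"Alter ego" objections', "phrase")
boolean_request = build_search_request('"Alter ego" objections', "boolean")
```

Here the phrase request would be sent as Alter ego objections, and the boolean request as "Alter ego objections".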
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We can do one thing: we will add one column to the stored procedure results. The column name will be BranchText.

The BranchText column will not include HTML elements or double quotation marks.

Is it possible for you to change your logic in the Web API to search on the BranchText field instead of BranchName?

We will maintain BranchName column as it is for data display purpose.


Server : 10.68.138.13 (Web Server),  10.68.138.14 (Database Server)
SP Name : FE_GetMasterTreeSearchForGenerateIndexing
Database Name : ISLGRebuildStaging
Module : Subject Navigator
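
As an illustration only, the BranchText idea could be sketched as below. The real change would live in the stored procedure named above; this Python version just shows the intended transformation, and the helper names are hypothetical.

```python
from html.parser import HTMLParser

class _TextOnly(HTMLParser):
    """Collects only the text between tags and discards the tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

def to_branch_text(branch_name):
    """Hypothetical helper: derive a plain-text value from BranchName by
    removing HTML elements and double quotation marks."""
    parser = _TextOnly()
    parser.feed(branch_name)
    parser.close()  # flush any text still buffered by the parser
    text = "".join(parser.parts).replace('"', " ")
    return " ".join(text.split())  # collapse repeated whitespace

plain_award = to_branch_text(
    "<EM>Waste Management v. Mexico II</EM> Final Award"
)
plain_alter_ego = to_branch_text('"Alter ego" objections')
```

BranchName stays untouched for display, and only the derived value would be indexed for searching.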


Let me know your thoughts.

Cc : Martin Laporte, CTO at Tologix Martin Piyush Kanpariya, DevIT Piyush  
Martin Laporte, CTO at Tologix
Hi Radomir Mladenovic, Contegra Radomir ,

Before you proceed with the request above from Harsh Parikh, Tech Lead at DevIT Harsh , can you help me understand why we must pass the entire search string to dtSearch within double-quotes?

If I, as a user, type:
 Waste Management v. Mexico II Final Award 
I would expect to have a match, as I did not encapsulate my search string with double-quotes. 


On the other hand, if I type:
"Waste Management v. Mexico II Final Award"
Then I understand that in our current setup, I would not get a match since this exact string cannot be found (since we're dealing with an <em> tag in the middle).

The above examples would mimic how Google and most other search engines behave.
Do you agree?

Thanks,
-Martin
Radomir Mladenovic, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh   you can add the new column. As far as I remember, no changes to the indexer are needed - the new column will be picked up and indexed automatically.

Martin Laporte, CTO at Tologix Martin   quotes are not necessary. It was just to demonstrate finding exactly the same document you know you have in the database. Without the quotes it should still find it but you will get many other documents as well (e.g. if you're using "any word" type of query).
Harsh Parikh, Tech Lead at DevIT
Hi Paul Moon Paul ,

This issue has been resolved on staging.islg. Please check and confirm.

Cc : Martin Laporte, CTO at Tologix Martin  
Paul Moon
Hi Harsh Parikh, Tech Lead at DevIT Harsh :

It looks good on staging.islg. Please let me know when it is deployed to app.islg, as I'll have to let the client know. I'll leave this item open until then.

Thanks,

Paul
Martin Laporte, CTO at Tologix
Hi Paul Moon Paul ,

I tested on Production today and it looks like it's fixed.

Thanks,
-Martin
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

We have resolved the issue with double-quoted words and HTML text in the Subject Navigator module. But as per the above video by Paul, those words are not highlighted.

Please note that we have created one new column in the result query that does not contain double quotation marks or HTML.

Please check from your side and let us know.



Cc : Rob Ferguson, Team Lead - Web Development at Industrial Rob Martin Laporte, CTO at Tologix Martin Piyush Kanpariya, DevIT Piyush Paul Moon Paul  
Martin Laporte, CTO at Tologix
Hi Radomir Mladenovic, Contegra Radomir ,

Can you provide your input on the latest issue reported by Paul above?

Thanks,
-Martin

CC: Rob Wiesenberg, Contegra Rob Harsh Parikh, Tech Lead at DevIT Harsh  
Radomir Mladenovic, Contegra
Hi,

Search for "Waste Management v. Mexico II Final Award analyzes cases where a persistent and serious breach of a contract by a State organ can constitute expropriation, or to conduct tantamount to expropriation" (without quotes) returns highlights in two fields:

The "branchnametext" field has all the terms highlighted, whereas "branchname" has highlights from "serious breach of contract".

It's similar with  "\"alter ego\" objections" - I see words properly highlighted in the branchnametext field.

I think you should use branchnametext to show highlighted fields, not the branchname. It's the same issue as discussed earlier - dtSearch does not support fields with HTML so searching and highlighting are not working properly. Simply use the plain text field version (branchnametext) instead of the original (branchname) whenever it appears in the highlighted fields section.
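
A minimal sketch of that suggestion, assuming the highlighted fields arrive as a simple mapping of field name to highlighted snippet. That response shape and the function name are assumptions made for illustration; the actual Web API response may look different.

```python
# Maps each HTML-bearing field to its plain-text twin (names from this thread).
PLAIN_TEXT_FIELD = {"branchname": "branchnametext"}

def pick_highlights(highlighted_fields):
    """Drop an HTML field's highlights when its plain-text twin is present.

    highlighted_fields is an assumed shape: {field name: highlighted snippet}.
    """
    chosen = {}
    for name, snippet in highlighted_fields.items():
        twin = PLAIN_TEXT_FIELD.get(name)
        if twin in highlighted_fields:
            continue  # prefer the plain-text version of this field
        chosen[name] = snippet
    return chosen

both_fields = pick_highlights(
    {"branchname": "partial highlights", "branchnametext": "full highlights"}
)
only_html_field = pick_highlights({"branchname": "partial highlights"})
```

If only the HTML field is present, it is kept as a fallback so that some highlight is still shown.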

Hope this helps.
Harsh Parikh, Tech Lead at DevIT
Thanks Radomir Mladenovic, Contegra Radomir .

Paul Moon Paul , Could you please check this task on staging.islg.


Cc  Martin Laporte, CTO at Tologix Martin  
Paul Moon
Hi Harsh Parikh, Tech Lead at DevIT Harsh :

As shown below, "and" and "or" are not highlighted - is there a reason why?
Paul
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir ,

As per the above comment from Paul, noise words such as "and" and "or" are not highlighted in the Subject Navigator module. As you know, we had already removed all noise words.

Can you explain the correct behavior for this?

Cc : Martin Laporte, CTO at Tologix Martin Paul Moon Paul Rob Ferguson, Team Lead - Web Development at Industrial Rob  
Rob Wiesenberg, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh

You should be able to highlight AND and OR once they are out of the noise.dat file and you have reindexed. You might try doing a quick test searching with dtSearch Desktop to see if they are still getting highlighted. Also be sure that your indexer is using the noise.dat file that you have edited. It is easy to mistakenly use another noise.dat file that may be on the system in a different location. Let us know.

Thanks,
Rob
Harsh Parikh, Tech Lead at DevIT
Hi Rob Ferguson, Team Lead - Web Development at Industrial Rob ,

We have removed the noise words. I don't remember why we removed them, but when Morgan was still available he insisted that we remove those words.

Could you please explain the benefit of removing noise words?

Cc : Radomir Mladenovic, Contegra Radomir Martin Laporte, CTO at Tologix Martin Paul Moon Paul  
Rob Wiesenberg, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh

Removing the noise word (aka stopwords) list (noise.dat) allows those words to be indexed, and they can therefore be searched and highlighted. There were use cases where this was deemed useful. Usually it is when the noise words are part of phrases and add meaning. You can review the noise.dat file to get an idea of the default terms (a, about, after, all, also, an, and, any, are, as, at, be, been, but, by, can, come, could...).

Removing the noise words will increase the index size and indexing time as these words are very frequently found in all documents. Hope this helps.
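
The trade-off above can be shown with a toy inverted index. This is a simplified sketch and not how dtSearch builds its index; the function name is hypothetical, and the two-word noise list is only a stand-in for noise.dat, using two of the default terms quoted above.

```python
import re
from collections import defaultdict

def build_index(documents, noise_words=frozenset()):
    """Toy inverted index: word -> set of document ids.
    Words in noise_words are skipped, so they cannot be searched."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            if word not in noise_words:
                index[word].add(doc_id)
    return dict(index)

docs = {
    1: "a persistent and serious breach of a contract",
    2: "expropriation or conduct tantamount to expropriation",
}

# Stand-in for noise.dat (an assumption, not the file's real contents).
sample_noise = frozenset({"a", "and"})

with_noise_list = build_index(docs, sample_noise)  # noise words excluded
without_noise_list = build_index(docs)             # empty list: all indexed
```

With the noise list in place, "and" has no index entry, so it cannot be matched or highlighted. With an empty list, every word is indexed and the index grows accordingly.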
Martin Laporte, CTO at Tologix
Hi Harsh Parikh, Tech Lead at DevIT Harsh ,

I would like to recap the conversation to make sure we are all on the same page:
  • Paul Moon Paul is reporting that the noise words do not get highlighted
  • Rob Wiesenberg, Contegra Rob and Radomir Mladenovic, Contegra Radomir explained that removing the noise words from noise.dat will cause these words to be highlighted
  • Morgan asked that we remove the noise words over a year ago

Assuming the above is accurate, then it seems to me that even though we were supposed to exclude the noise words, we are currently including them. Please confirm.

-Martin
Harsh Parikh, Tech Lead at DevIT
Hi Rob Ferguson, Team Lead - Web Development at Industrial Rob and Radomir Mladenovic, Contegra Radomir ,

I have included all the noise words and re-indexed, but the noise words are still not highlighted in the Subject Navigator module.



Could you please check and give your feedback.

Search text :  

Waste Management v. Mexico II Award analyzes cases where a persistent and serious breach of a contract by a State organ can constitute expropriation, or conduct tantamount to expropriation

Web Server : 10.68.138.13
DB Indexer Folder : E Drive -> ISLGRebuildStagingDBIndexer

Database Server : 10.68.138.14
Database Name : ISLGRebuildStaging


Please let me know if you need any more details.

Cc : Paul Moon Paul Martin Laporte, CTO at Tologix Martin  
Martin Laporte, CTO at Tologix
Hi Rob Wiesenberg, Contegra Rob and Radomir Mladenovic, Contegra Radomir ,

I had a chat with Harsh Parikh, Tech Lead at DevIT Harsh and he now understands that we need to REMOVE the words from the noise.dat file if we want them to be highlighted in ISLG.

He is working on re-indexing with an empty noise.dat and will report back in this thread.

No further action is needed from you at this time.

Thanks,
-Martin

CC: Paul Moon Paul  
Harsh Parikh, Tech Lead at DevIT
Hi Radomir Mladenovic, Contegra Radomir   and  Rob Wiesenberg, Contegra Rob

I have checked again after removing the noise words and re-indexing. The noise words are still not highlighted, as per the above screenshot.

Please check and provide your feedback.

Cc : Martin Laporte, CTO at Tologix Martin   Paul Moon Paul  
Rob Wiesenberg, Contegra
Hi Harsh Parikh, Tech Lead at DevIT Harsh , Martin Laporte, CTO at Tologix Martin , and Radomir Mladenovic, Contegra Radomir ,

Looking at the screenshot it appears that all of the noise words except the word "AND" and the word "OR" are searchable and are getting highlighted. These two words are dtSearch search operator commands so they are not treated as searchable terms by default. Let me double check to see if there is a work around. I'll let you know.
Rob Wiesenberg, Contegra
Harsh Parikh, Tech Lead at DevIT Harsh , Martin Laporte, CTO at Tologix Martin and Radomir Mladenovic, Contegra Radomir ,

I am guessing that you are sending the queries to dtSearch as Boolean requests vs Any Word or All Words. In Boolean mode the terms AND and OR are search operators and are therefore not treated as search terms, so they do not get highlighted. If you switch to All Words then these terms will be highlighted. There may be other implications, so I am not sure whether it will interfere with other search logic. Please see: https://support.dtsearch.com/webhelp/dtsearchcppapi/AllWords_and_AnyWords.html
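
The difference between the two modes can be sketched as follows. This is a toy model of which words end up as highlightable terms; it is not dtSearch's query parser, and the mode labels and function name are hypothetical.

```python
# The two operators discussed in this thread.
OPERATORS = {"and", "or"}

def highlight_terms(request, mode):
    """Return the words that would count as search terms.

    mode is a hypothetical label: "boolean" or "all_words".
    """
    words = request.lower().split()
    if mode == "boolean":
        # Operators steer the query; they are not terms to match.
        return [w for w in words if w not in OPERATORS]
    if mode == "all_words":
        # Every word is an ordinary term, including "and" and "or".
        return words
    raise ValueError(f"unsupported mode: {mode}")

boolean_terms = highlight_terms("persistent and serious breach", "boolean")
all_words_terms = highlight_terms("persistent and serious breach", "all_words")
```

In the boolean case "and" drops out of the term list, which matches the missing highlights reported above. In the all-words case it stays in.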
   
Paul Moon
Thanks, everyone.

Upon discussion with Martin Laporte, CTO at Tologix Martin , we'll leave app.islg as is and not make any further changes.
Paul Moon
Paul Moon completed this to-do.