dtSearch Implementation in ILG
Hello everyone,
Following up on our kick-off call to implement dtSearch within ILG, here is a recording of the meeting as well as the document that Jitesh circulated in advance of the meeting:
Tomorrow, we will have a call with Melissa and Naomi to discuss which updates to the requirements are needed for the keyword search feature.
Thanks,
Morgan
Here is the recording from today's meeting to discuss the changes in the UI design:
As discussed, because it will be very tight for us to complete and finalize the design across the different views within search, we'll aim to perform the dtSearch implementation in Sprint 4, between June 21st and July 12th.
We also discussed and agreed to proceed as follows:
Thanks,
Morgan
Thanks,
Radomir
Morgan
Does that work for everyone?
Morgan
Will you have enough bandwidth to produce the bulk of the requirements by June 4th?
Thanks,
--Martin
Yes, we will have wireframes for the updated requirements to demo on Thursday. Once those are confirmed we will finalize the written requirements ahead of June 4th.
Note, HTML templates will likely not be completed at that time and we will await the dev team's input there.
Mel
https://islg.egnyte.com/fl/JMJB0vjN8V
The first version of ILG dtSearch indexer and web service is available for download from:
https://1drv.ms/u/s!AugzRBG6eTFwiusxFHrIlT0MqIsfuQ?e=HU2FQe
All updates will be uploaded there as well.
Similarly to the ISLG project, invoke the indexer from the command line, specifying the config file:
Check the indexer and the web service config files and adjust them to your environment.
In the web service delivery I included a sample JSON request and response (for the current content in the database). To run a search, send a POST request to /api/search/main.
Options for the search request are the same as those we used in the previous project.
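As a rough illustration of the call described above, here is a minimal Python sketch that builds the POST request for /api/search/main. The base URL is a placeholder (an assumption, not part of the delivery), and the option names are taken from the request samples that appear elsewhere in this thread; check the sample JSON shipped with the service for the full set.

```python
import json
import urllib.request

# ASSUMPTION: placeholder address; replace with the URL of your search service.
BASE_URL = "http://localhost:5000"


def build_search_request(keyword, page_num=0, page_size=20):
    """Build (but do not send) a POST request for /api/search/main."""
    payload = {
        "searchRequest": keyword,
        "SortField": "modifiedon",
        "SortOrder": "desc",
        "PageNum": page_num,
        "PageSize": page_size,
    }
    return urllib.request.Request(
        BASE_URL + "/api/search/main",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request("canada")

# To run the search against a live service, uncomment:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

The request is only constructed here, so the sketch can be tried without a running service.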
Project meta fields in dtSearch index are named pmf<N> where N is the meta field ID from the database (e.g. pmf38). Document meta fields are named as dmf<N>.
To find how many hits you have in the project or document meta fields, you can simply check highlightedFields of the found document object and count the hits in fields whose name starts with "pmf" or "dmf".
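A minimal sketch of that counting step follows. The exact shape of highlightedFields is not shown in this thread, so the sketch assumes it is a mapping from field name to a list of highlighted snippets, one per hit; adapt it to the real response.

```python
def count_meta_field_hits(highlighted_fields):
    """Count hits in project ("pmf") and document ("dmf") meta fields.

    ASSUMPTION: highlighted_fields maps a field name to a list of
    highlighted snippets, one entry per hit.
    """
    counts = {"pmf": 0, "dmf": 0}
    for name, snippets in highlighted_fields.items():
        for prefix in counts:
            if name.startswith(prefix):
                counts[prefix] += len(snippets)
    return counts


# Hypothetical sample data, for illustration only.
sample = {
    "pmf38": ["<hit>Canada</hit>"],
    "dmf7": ["first snippet", "second snippet"],
    "title": ["not a meta field"],
}
print(count_meta_field_hits(sample))  # {'pmf': 1, 'dmf': 2}
```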
Some issues I've noticed:
Obviously I don't have proper ValueIDs for this example but I hope you understand. (I excluded ContentTypeDataId column as it's not indexed.)
Using this data, I'd create two separate fields, pmf33 and pmf33id containing values and value IDs respectively.
I hope we're on a good track. Let me know if you have any questions.
Thank you for providing the indexer and web service to implement dtSearch functionality in ILG.
I am trying to create the index but am getting the error below. I adjusted the configuration for my system in the file "indexer-config-ilg" in the IndexJul2 folder, as below.
After the above changes, I ran the command below from the command line:
"TologixILGDBIndexer.exe indexer-config-ilg.json"
but I am getting the error below.
Can you please tell me the reason for the above error? I asked Harsh about it, based on the ISLG implementation, and he told me we need to inform
Thanks,
Jitesh
Looks like you either don't have the dtSearch dependencies installed, or you have some dtSearch DLL in the path (maybe under system) whose version conflicts with the version I sent you.
For dtSearch dependencies, check:
https://support.dtsearch.com/dts0197.htm
http://support.dtsearch.com/webhelp/dtsearchcppapi/_NET_Deployment__MFC_and_CRT_dependencies.html
If you find that you already have dtSearch, just copy your version of dtSearch DLLs to the indexer's folder.
Hope this helps.
We need some clarification on the dtSearch response you provided for the ILG user story requirements. Can you please arrange a call to discuss the results?
We also need results broken down by project and document meta fields. Let us know when you are available and we will fit your schedule.
Thanks,
Jitesh
Monday 12 PM IST works for me.
Thanks,
Jitesh
Can you please confirm your timing and the URL for the Zoom meeting?
Thanks,
Jitesh
As per the discussion in today's call, please provide an updated web service with the following features.
1) Document and project meta field counts, to display in the document details and project details sections as per the wireframe.
2) Viewing the entire document with highlighting, as in the ongoing ISLG implementation.
3) Meta field filter parameters: we are trying to implement these as you explained for ISLG, and we will let you know if we have any questions.
Thanks,
Jitesh
You can find the search service update in the 2021-07-27 folder on OneDrive.
Besides including highlightedFields with fields containing highlighted keywords, I added documentFieldsMatches and projectFieldsMatches to indicate the number of matching fields in the document and project meta data.
Yesterday I delivered the solution for full document highlighting to the ISLG team. As discussed, let's wait for their feedback as some changes may be needed. That should minimize potential issues after porting the code to ILG.
Let me know if you need any additional clarifications for sending filters.
Thanks,
Radomir
I have added two columns, "RelevanceSorting" (number) and "DocumentNameSorting" (string), to the SQL view "vw_DTSearch_DocumentDetails". There are 4 types of sorting parameters to pass in the "Main" API request, but at most 1 of them is passed at a time, so please review the API request options below and provide more options for sorting.
"SortField":"RelevanceSorting",
"SortOrder":"desc",
OR
"SortField":"DocumentNameSorting",
"SortOrder":"asc",
OR
"SortField":"modifiedon",
"SortOrder":"desc",
OR
"SortField":"modifiedon",
"SortOrder":"asc",
Thanks
Hiren Patel
I'm not sure I understand what you're asking. Do you need to support sorting by more than one field? That's not supported by dtSearch. To sort by more than one field, you have to create additional fields for indexing that combine multiple fields into a sortable string, and then use that field for sorting. However, to make results sortable by 4 fields with asc/desc order... you'd have to provide quite a number of combinations. Not sure that's the way to go.
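To illustrate the "combine multiple fields into a sortable string" idea, here is a small Python sketch. It is not dtSearch API code; it only shows how such a combined value could be computed (for example in the DB view or application) before indexing. The function name and the choice of fields are hypothetical.

```python
def composite_sort_key(modified_on, document_name):
    """Combine two fields into one string that sorts by date, then by name.

    ASSUMPTION: modified_on is already a fixed-width YYYYMMDD string, so
    plain string comparison orders it chronologically.
    """
    return modified_on + "|" + document_name.lower()


# Hypothetical rows: (modified_on, document_name)
rows = [
    ("20210819", "B"),
    ("20210817", "Z"),
    ("20210817", "A"),
]
ordered = sorted(rows, key=lambda row: composite_sort_key(*row))
print(ordered)
```

Note that one combined field gives one ordering only (here: date ascending, then name ascending). Each further combination of fields and directions would need its own combined field, which is why the number of combinations grows quickly.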
I am not sending multiple sorting parameters in the request; I am passing "SortField" and "SortOrder" dynamically, but I could not find my newly added fields in your API response. Now I can see the "RelevanceSorting" and "DocumentNameSorting" fields in your API response. Were there any changes on the server from your side?
The problem with sorting by dynamic parameters is now solved.
Thanks
Hiren Patel
Does Contegra search provide search functionality for both PDF and HTML files?
So far I have found results only for HTML files. Please confirm.
Thanks
Hiren Patel
If I understand correctly, you now see the sorting fields in the response? No changes are needed in the search service; you can reference these fields in the search request.
(BTW, I didn't see RelevanceSorting and DocumentNameSorting in the ILGSME database, so I guess you're indexing another database?)
As for searching HTML and PDF: if an HTML file exists, it will be indexed and the PDF will not. The PDF is indexed only if the HTML is not available. That's similar to the ISLG project, except that there we also had a flag in the DB to tell the indexer whether the HTML exists; for ILG I check file existence.
Thanks,
Radomir
I have generated the index on the database below. It is still showing only HTML files. I have attached screenshots here. Is anything missing on my side?
SQL Server : 192.16.138.11
Database Name: ILGSME
SQL View : "vw_DTSearch_DocumentDetails"
Hiren Patel
I suggest checking indexing.log for any error messages.
Make sure your indexing config file properly references Highlighter, as it's used for page extraction.
I have replaced PDFHighlighterUrl value and it is working.
Thanks
Hiren Patel
I have applied filtering on meta fields and checked it with some randomly chosen data, but I am not getting any results. I have attached a screenshot and some examples below.
1) Data shown without a filter.
2) With a filter applied on meta fields, no results are returned.
Please check with the data below.
"FilterStatement": {
"type": "match",
"field": "pmf11",
"value": "Social & Defence"
},
"FilterStatement": {
"type": "match",
"field": "pmf14",
"value": "Transport - Roads"
},
"FilterStatement": {
"type": "match",
"field": "pmf15",
"value": "Announced/In Procurement"
},
"FilterStatement": {
"type": "match",
"field": "pmf18",
"value": "Ontario Infrastructure"
},
"FilterStatement": {
"type": "match",
"field": "pmf19",
"value": "Infrastructure Ontario"
},
"FilterStatement": {
"type": "match",
"field": "pmf20",
"value": "BBPP Alberta Schools"
},
"FilterStatement": {
"type": "match",
"field": "pmf21",
"value": "Babcock & Brown (International Public Partnership) Gracorp Capital Advisors"
},
Thanks
Hiren Patel
I have created a SQL view, vw_DTSearch_DocumentTopics_ParaList, on the ILGSME database. I need a distinct count of the field HtmlParagraphReference
and also a distinct list of HtmlParagraphReference values, per DocumentId, with the search request applied to the TagName field. It will be included in the Main API request.
Let me know if further discussion is required.
Thanks
Hiren Patel
As per today's Skype call, some changes will be required in dtSearch.
1) The dtSearch filtering option does not work with special characters or with values of more than one word, so I have changed the SQL view vw_DTSearch_Project_Document_MetaFields by adding two new columns, ValueIDs and Value1IDs, which hold IDs instead of text, with multiple values separated by commas. Where a value is text, I have removed spaces and special characters in the new columns.
Apply filtering on ValueIDs. If the meta field type is Date Range selection or Currency, apply the filter on both columns, ValueIDs and Value1IDs.
2) One more meta field category, Tag (Topic) meta fields (ContentTypeCategoryId=3), will be added to vw_DTSearch_Project_Document_MetaFields for topic meta field filtering.
3) I have also added a TagId column to the SQL view vw_DTSearch_DocumentDetails for applying specific topic filters.
4) Keep the project and document meta fields with values as they are, and generate new project and document meta fields based on IDs for filtering.
5) Update the hitCount value to be the sum of the values below:
hitCount = documentFieldsMatches + projectFieldsMatches + paragraphs count + topic paragraphs count
Let me know if further discussion is required.
Thanks
Hiren Patel
1) I extended the indexer to index new columns for Value1, ValueIDs and Value1IDs, using the same field name base as before and adding the suffixes "_1", "ids" and "ids_1" respectively.
As for applying filters on both ValueIDs and Value1IDs in the case of date range and currency, I'm afraid there's no way for the search service to know the fields' type beforehand. Your application should build an appropriate boolean filter expression using both fields.
2) OK, Tag Meta Fields will be indexed with prefix "tmf", followed by field ID, etc.
3) OK, TagId was added as a meta and can be searched:
4) OK, added as described in 1):
5) OK, changed the hitCount as requested.
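The naming scheme from point 1) and the application-side boolean filter it calls for could be sketched as follows. The clause layout ("type", "Operator", "clauses", "field", "from", "to") follows the sample filter requests in this thread; the helper names are hypothetical, and whether the two range clauses should be joined with "and" or "or" is an assumption that depends on how Value and Value1 are meant to be compared.

```python
def meta_field_names(prefix, field_id):
    """Return the index field names for one meta field (e.g. "pmf", 33)."""
    base = f"{prefix}{field_id}"
    return {
        "Value": base,
        "Value1": base + "_1",
        "ValueIDs": base + "ids",
        "Value1IDs": base + "ids_1",
    }


def range_on_both_id_fields(prefix, field_id, start, end, operator="and"):
    """Build a boolean filter with a range clause on ValueIDs and Value1IDs.

    ASSUMPTION: the default "and" operator is a guess; pick the operator
    that matches your date range / currency semantics.
    """
    names = meta_field_names(prefix, field_id)
    return {
        "type": "boolean",
        "Operator": operator,
        "clauses": [
            {"type": "range", "field": names["ValueIDs"], "from": start, "to": end},
            {"type": "range", "field": names["Value1IDs"], "from": start, "to": end},
        ],
    }


filter_statement = range_on_both_id_fields("pmf", 41, "20210101", "20211231")
print(filter_statement)
```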
The indexer update is on OneDrive in folder 2021-08-11. You mentioned that I can copy indexer updates and created indices to a folder on the server. Please confirm to which path you'd like them to be saved.
Thanks,
Radomir
Is the column DocUIN needed for searching or results? If not, let's not include it in the index, as it will "pollute" the index with additional terms (e.g. "on") that could cause false positive matches.
The DocUIN field is not required, so I have deleted it from the view. Use DocumentId for the join between the two views vw_DTSearch_DocumentDetails and vw_DTSearch_DocumentTopics_ParaList.
Thanks
Hiren Patel
Any updates on your 27 July comments?
Yesterday I delivered solution for full document highlighting to the ISLG team. As discussed, let's wait on their feedback as some changes may be needed. That should minimize eventual issues after porting the code to the ILG.
Thanks
Hiren Patel
Thanks,
Radomir
We have started integrating the highlighting service in ISLG and will contact you within 4 to 5 days if we face any issues.
2) OK, Tag Meta Fields will be indexed with prefix "tmf", followed by field ID, etc.
Following your comment of 11 Aug 2021 on tag meta fields, I have set ContentTypeCategoryId=3 and ContentTypeDataMasterId = DocumentId. I have changed indexer-config-ilg as follows:
"NestedQuery": "select * from vw_DTSearch_Project_Document_MetaFields where (ContentTypeDataMasterId = $DocumentContentTypeDataMasterId$ and ContentTypeCategoryId=2) or (ContentTypeDataMasterId = $ProjectContentTypeDataMasterId$ and ContentTypeCategoryId=1) or (ContentTypeCategoryId=3 and ContentTypeDataMasterId = $DocumentId$)",
I am not able to see the tag meta fields as tmf. Are any other changes required in the indexer-config-ilg file?
Thanks
Hiren Patel
You can find an update in the OneDrive folder 2021-08-15.
1) Indexer has been updated to support changes to the NestedQuery for tag meta fields. I'm getting them now in results.
2) Indexer has been updated to use vw_DTSearch_DocumentTopics_ParaList and search service as well to use the fourth index "tagparas" - referenced by ParaTagsIndexDir in indexing config file, and ILGParaTagsIndex in the search service config.
I created tagParagraphs structure a bit differently to include details about matched tag - as more than one tag may be found.
Let me know if you have any questions.
Thanks,
Radomir
I followed the above process to generate a new index but am getting the error below.
I am generating the index at the path below on the ILG staging server.
E:\ILGIndexer\Contegra Indexer generator\indexerJul12.
Thanks
Hiren Patel
I have attached config file.
Thanks
Hiren Patel
Indexing error resolved.
Thanks
Hiren Patel
Exactly the same usage instructions apply as for the ISLG search service so below I'm just copying instructions I sent earlier to the ISLG team.
The update for the web service with document highlighting methods is under the 2021-07-26 folder on OneDrive (https://1drv.ms/u/s!AugzRBG6eTFwg4sJwygyvkWxOQnMWg?e=Sjhqhu)
You'll notice that the layout of the highlighted HTML is not the same. I think it's because dtSearch is stripping CSS from documents. I could not find a way or option to work around this. Maybe you could inject a reference to an external CSS file, if the HTML documents share the same styles.
I am facing the issue below in highlight-html. Also, please accept DocumentId instead of internal_file_id, because we pass DocumentId in our system.
Thanks
Hiren Patel
Questions:
1) Do HTML/PDF files uniquely belong to rows in vw_DTSearch_DocumentDetails? If an HTML or PDF file can be used by more than one record (different DocumentIDs), as is the case for ISLG, then it gets complicated and we'll have to create one more index to support this.
2) As one record has fields referencing both HTML and PDF files, do you need to highlight just one (e.g. only HTML if available, otherwise PDF), or do you need the possibility to highlight both HTML and PDF?
Thanks,
Radomir
I took SearchController.cs from https://onedrive.live.com/?authkey=%21ABR6yJU9DKiLH7k&id=703179BA114433E8%21182549&cid=703179BA114433E8, replaced it in the dtSearch service and published the code, but I am still getting the same error. Are any additional changes required?
1) Do HTML/PDF files uniquely belong to rows in vw_DTSearch_DocumentDetails? If an HTML or PDF file can be used by more than one record (different DocumentIDs), as is the case for ISLG, then it gets complicated and we'll have to create one more index to support this.
HTML/PDF files belong uniquely to rows in vw_DTSearch_DocumentDetails. Each HTML and PDF file is associated with exactly one DocumentId in vw_DTSearch_DocumentDetails.
2) As one record has fields referencing both HTML and PDF files, do you need to highlight just one (e.g. only HTML if available, otherwise PDF), or do you need the possibility to highlight both HTML and PDF?
We need the possibility to highlight both HTML and PDF.
If possible, let's discuss this on a call to reach a better resolution. Please arrange a call as per your availability.
Thanks
Hiren Patel
As you need to highlight both HTML and PDF for the same record, we need to change the indexer to build one more index with all documents. Currently, only one of them is indexed (HTML if available, otherwise PDF), and the other cannot be referenced for highlighting without caching.
I should have an update for you in a day or two.
Thanks,
Radomir
Having only one of them indexed (HTML if available, otherwise PDF), as it is currently, is fine for us. No change is needed in the indexer. Are you saying that the error will be resolved if I pass DocumentId instead of the internal ID, or are there changes pending on your side?
Is it possible to connect for a better understanding?
Thanks
Hiren Patel
a) If you need HTML only (or PDF only if not available), then it's a simple change. (Also assuming that one HTML is not referenced by more than one record.)
I already made a change for this and you can find it in the 2021-08-18 folder, along with a change in the search service to accept documentId in docId.
b) If you want to be able to highlight both HTML and PDF of the same documentId, then an additional index needs to be created, with corresponding logic changes in the search service. Please let me know if you need to cover this use case.
I have taken the changes from the 2021-08-18 folder. The HTML highlighter is working, but the PDF highlighter is still not working. I have already added "PdfHighlighterUrl": "http://192.168.1.101:8998", to the indexer-config-ilg.json file. Are any other changes required?
a) If you need HTML only (or PDF only if not available), then it's a simple change. (Also assuming that one HTML is not referenced by more than one record.)
I already made a change for this and you can find it in the 2021-08-18 folder, along with a change in the search service to accept documentId in docId.
I agree with this approach. No PDF or HTML file is referenced by more than one record.
Thanks
Hiren Patel
One more field, data-key, is required in the highlight-html response.
Let me know if you have any question.
Thanks
Hiren Patel
Some additional fields are required in dtSearch, as described below.
1) In highlight-html and highlight-pdf, return all project, document and tag meta fields, along with the highlighted meta fields and their count. I have added one column, MetaFieldName, to the SQL view vw_DTSearch_Project_Document_MetaFields, which should also be exposed as pmf<N>name, dmf<N>name and tmf<N>name.
2) In Main, add the same fields, pmf<N>name, dmf<N>name and tmf<N>name, as described above.
Let me know if you have any question.
Thanks
Hiren Patel
I have some combinations of project, document and tag meta fields to apply as filters on the Main API request. Please prepare a sample JSON request for the filters below.
ProjectMetaFieldId(pmf3ids) = 3(Project Name) and ((Value = '178' OR Value = '215' OR Value = '252') AND (Value not equal to '145' OR Value not equal to '289' OR Value not equal to '326'))
AND
ProjectMetaFieldId(pmf2ids) = 2(Date Of Concession Agreement) and ((Value is between '08/17/2021' and '08/19/2021') AND (Value is not between '08/21/2021' and '08/24/2021'))
AND
TagMetaFieldId(tmf50ids) = 50(Risk allocated to Private Party) and (Value=true AND Value=False)
AND
DocumentMetaFieldId(dmf7ids) = 7(Legal Advisors) and ((Value = '17' OR Value = '22' OR Value = '47') AND (Value not equal to '11' OR Value not equal to '67' OR Value not equal to '88'))
Let me know if you have any question.
Thanks
Hiren Patel
I sent you an indexer and search service update for the requirements you sent this morning.
I'll look into your filter examples and be back to you.
Thanks,
Radomir
(Value not equal to '145' OR Value not equal to '289' OR Value not equal to '326')
If a document has a single Value, then this would match every document, because even if Value=145, it's still not equal to 289 or 326, so the result would be TRUE.
Maybe you wanted:
not (Value equal to '145' OR Value equal to '289' OR Value equal to '326')
?
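The difference between the two expressions can be checked with a few lines of Python. This is only a truth-table illustration for a single-valued field, not search service code; the function names are made up for the example.

```python
EXCLUDED = {"145", "289", "326"}


def or_of_not_equals(value):
    # (Value != '145' OR Value != '289' OR Value != '326')
    return any(value != excluded for excluded in EXCLUDED)


def not_of_or_equals(value):
    # not (Value == '145' OR Value == '289' OR Value == '326')
    return not any(value == excluded for excluded in EXCLUDED)


for value in ["145", "178"]:
    print(value, or_of_not_equals(value), not_of_or_equals(value))
# 145 True False
# 178 True True
```

The first form is true even for the value '145', so it filters nothing out; only the second form excludes the three listed values.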
Radomir
I think it would be better to discuss this on a call. Please arrange a call as per your availability.
Thanks
Hiren Patel
Thanks,
Radomir
Monday is too late, so I will explain in detail here.
(Value not equal to '145' OR Value not equal to '289' OR Value not equal to '326')
"Not equal to" means exclude.
If a document has a single Value, then this would match every document, because even if Value=145, it's still not equal to 289 or 326, so the result would be TRUE.
Maybe you wanted:
not (Value equal to '145' OR Value equal to '289' OR Value equal to '326')
Get all values excluding "145", "289" and "326".
Thanks
Hiren Patel
1. ProjectMetaFieldId(pmf3ids) = 3(Project Name): it looks like you're trying to limit the search to pmf 3 with numerical values (178, 215, 252). However, the indexed values for pmf3 are not numerical; they are IDs like "AlbertaSchoolsASAPI", "Sharmilatestproject", etc.
2. For pmf2, for date ranges, you need to format dates in the YYYYMMDD format, the same as for sorting. You need to fix this in the DB view.
"pmf39ids": "20050729"
3. For tmf50, you have condition (Value=true AND Value=False). For a single value field this doesn't make sense as it will always give FALSE as the result.
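For point 2, the date values used in filter requests need the same YYYYMMDD form as the indexed values. A minimal Python sketch of that conversion is below; it assumes the application holds dates as MM/DD/YYYY strings, as in the filter examples in this thread, and the function name is hypothetical. The indexed values themselves still have to be fixed in the DB view, as noted above.

```python
from datetime import datetime


def to_index_date(us_date):
    """Convert an MM/DD/YYYY string to the YYYYMMDD form used in the index.

    ASSUMPTION: input dates are MM/DD/YYYY; a malformed date raises ValueError.
    """
    return datetime.strptime(us_date, "%m/%d/%Y").strftime("%Y%m%d")


print(to_index_date("08/17/2021"))  # 20210817
```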
Thanks
Hiren Patel
{
  "searchRequest": "canada",
  "SearchType": "Boolean",
  "Stemming": true,
  "WordNetSynonyms": false,
  "Fuzzy": false,
  "Fuzziness": "1",
  "FilterStatement": {
    "type": "boolean",
    "Operator": "and",
    "clauses": [
      {
        "type": "boolean",
        "Operator": "and",
        "clauses": [
          {
            "type": "boolean",
            "Operator": "and",
            "clauses": [
              { "type": "match", "field": "pmf3ids", "values": [ "AlbertaSchoolsASAPI", "Sharmilatestproject" ] },
              { "type": "match", "field": "pmf3ids", "exclude": true, "values": [ "145", "289", "326" ] }
            ]
          },
          { "type": "range", "field": "pmf2ids", "from": "20050817", "to": "20210819" },
          { "type": "match", "field": "tmf50_1", "value": "false" },
          {
            "type": "boolean",
            "Operator": "and",
            "clauses": [
              { "type": "match", "field": "pmf34ids", "values": [ "608", "22", "47" ] },
              { "type": "match", "field": "pmf34ids", "exclude": true, "values": [ "11", "67", "88" ] }
            ]
          }
        ]
      }
    ]
  },
  "SortField": "modifiedon",
  "SortOrder": "desc",
  "PageNum": 0,
  "PageSize": 20
}
Thanks. I will apply the filters and contact you if I run into any difficulty.
Thanks
Hiren Patel
The ILG certification issue is resolved:
https://dev.infrastructurelawguide.com.
The PDF highlighter proxy is currently available on
https://dev.investorstatelawguide.com, which is the ISLG domain, so a separate proxy is needed for the ILG domain because of the cross-domain conflict. Please create a separate proxy for the PDF highlighter on
https://dev.infrastructurelawguide.com.
Let me know if you have any question.
Thanks
Hiren Patel
As discussed in today's call, one more feature is required in the PDF highlighter: clicking a listed page number should scroll the PDF file to that page, with the search word highlighted.
Example:
Suppose we search for the word "chile" and it is found 6 times in total in the PDF file.
Page No 1 (1 time)
Page No 2 (2 times)
Page No 3 (1 time)
Clicking Page No 1 scrolls the PDF file to page 1.
Clicking Page No 2 scrolls the PDF file to page 2.
And so on for the other pages.
Let me know if you have any questions.
Thanks
Hiren Patel
1) I have applied the PDF highlighter settings on my local (development) system, but the documentUrl generated by highlight-pdf is not correct.
2) In highlight-html, when I search for "canada", a result is returned if the word is found in both the meta fields and the document, but if the word is found only in the document, no result is returned.
3) In Main, a filter on pmf3ids with the value AlbertaSchoolsASAPI works fine, but a filter on pmf3ids with the value AnthonyHendayDriveNorthwestEdmontonRingRoad does not work.
Let me know if you have any questions.
Thanks
Hiren Patel
I have some combinations of meta fields to apply as filters on the Main API request. Please prepare a JSON request for the filters below.
ProjectMetaFieldId(pmf2ids) = 2(Date Of Concession Agreement)((Value is after '08/17/2021') OR (Value is before '08/24/2021')))
AND
ProjectMetaFieldId(pmf43ids) = 43(Project Cost)((Value is equal to '25 CAD') AND (Value is not equal to '30 CAD') AND (Value is greater than '80.5 CAD')
(Value is less than '500.40 CAD') AND (Value is between '10 CAD' and '600 CAD') AND (Value is not between '700 CAD' and '900 CAD'))
AND
ProjectMetaFieldId(pmf47ids) = 47(Project Debt/Equity)((Value is equal to '50') AND (Value is not equal to '10') AND (Value is greater than '15.5')
(Value is less than '60') AND (Value is between '10' and '50') AND (Value is not between '60' and '100'))
AND
ProjectMetaFieldId(pmf41ids) = 41(Concession Period)((pmf41ids < '09/22/2021' ) AND pmf41ids > '09/23/2021' ) AND pmf41ids_1 < '09/24/2021' ) AND
pmf41ids_1 > '09/25/2021' ) AND pmf41ids >= '09/25/2021' ) AND pmf41ids_1 <= '03/20/2022' ) AND pmf41ids_1 <= '09/20/2022' ))
Let me know if you have any concerns.
Thanks
Hiren Patel
1) I'm not sure what you mean by documentUrl not being proper. If you're talking about the base URL to Highlighter, you can change it by setting serviceUrl in Highlighter's application.conf to your user-facing Highlighter URL (e.g. "https://www.ilgapp.com/highlighter/").
2) Please send me indexer config you're using so that I could test the same data you're looking at and reproduce this case. I don't see document 13449 in the ILGSME sample database.
Regarding filters:
3) When searching for open date ranges (before/after), you can omit "from" or "to" field and the search service will default to min/max date.
4) dtSearch cannot search for numbers with decimal points and commas. Negative numbers are not supported either. Any such number you need to transform to a padded number without decimal point. For example, number "30.5" you could transform to "0000003050".
If you need to search with an open range, e.g. "greater than 15.5", you could search for example from "0000001551" to "9999999999".
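The padding rule can be sketched in Python as follows. The width of 10 and the two implied decimal places match the "30.5" to "0000003050" example above; both are parameters, and the function names are hypothetical.

```python
from decimal import Decimal


def pad_number(value, width=10, decimals=2):
    """Turn a non-negative decimal string into a zero-padded integer string."""
    scaled = Decimal(value) * (10 ** decimals)
    if scaled < 0:
        raise ValueError("negative numbers are not supported")
    if scaled != scaled.to_integral_value():
        raise ValueError("value has more decimal places than 'decimals'")
    digits = str(int(scaled))
    if len(digits) > width:
        raise ValueError("value does not fit into the padded width")
    return digits.zfill(width)


def greater_than_range(value, width=10, decimals=2):
    """Return (from, to) bounds for an open 'greater than value' search."""
    lower = int(pad_number(value, width, decimals)) + 1
    return str(lower).zfill(width), "9" * width


print(pad_number("30.5"))          # 0000003050
print(greater_than_range("15.5"))  # ('0000001551', '9999999999')
```

The same transformation has to be applied both to the indexed values and to the values used in range filters, otherwise string comparison will not line up.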
I hope this helps. If you still need help with creating the above filters, let me know after you update the database fields with padded numbers and I'll make a sample JSON filter for the above boolean searches.
1) I have used https://dev.infrastructurelawguide.com/highlighter/. I have attached appsettings.json for reference.
2) I have attached the indexer config for reference, and I used the JSON request below for highlight-pdf:
{
"searchRequest": "canada",
"SearchType": "3",
"Stemming": "false",
"WordNetSynonyms": "false",
"Fuzzy": "false",
"Fuzziness": "1",
"docId": "178",
"docUrl": "https://dev.infrastructurelawguide.com/Files/PDF/Dev-Test-PDF ONly.pdf"
}
3) I have converted the numeric and number fields to padded numbers without a decimal point, with a length of 15. Please generate a sample JSON for the above request.
Thanks
Hiren Patel
It is already set to http://10.68.138.10/highlighter/ which is used for both ILG and ISLG.
As of the HomeDir indexer parameter, I don't have it in my indexer config file. (I guess you copied it from some earlier version.) When HomeDir is not specified, the indexer will use folder of the indexer as home. Files that you should have there are default.abc, stemming.dat and noise.dat. I think you got them already with the indexer I sent you but I'm attaching them just in case you didn't.
I'm sending you updated indexer and web service in the OneDrive folder 2021-09-21.
The indexer has been updated to increase maximal searchable word length from default 21 chars to maximum 128 supported by dtSearch.
(https://support.dtsearch.com/webhelp/dtSearchNetApi2/dtSearch__Engine__Options.html - MaxWordLength option)
I tested search after reindexing and filtering on "pmf3ids": "TheRtHonHerbGrayParkwayformerlyWindsorEssexParkway" works now.
The search service has been updated to advise Highlighter to use the URL provided by the PdfHighlighterUrl service parameter. I also made one fix to prevent an error when there are no matches in meta fields. I'm not sure if that was affecting your highlighting of "economy", as I don't see the response code in your screenshot.
Please re-test this with the updated service and let me know.
I followed your instructions above, but the PDF URL is still not generated properly.
1) In documentUrl, the part before the comma,
https://dev.infrastructurelawguide.com/highlighter
is an extra URL.
2) In documentUrl, the PDF highlighter URL after the comma does not highlight any words.
"documentUrl": "https://dev.infrastructurelawguide.com/highlighter,https://dev.infrastructurelawguide.com/highlighter/viewer/?file=https%3A%2F%2Fdev.infrastructurelawguide.com%2FFiles%2FPDF%2FDev-Test-PDF%2520ONly.pdf&highlightsFile=https%3A%2F%2Fdev.infrastructurelawguide.com%2Fhighlighter%2Chttps%3A%2F%2Fdev.infrastructurelawguide.com%2Fhighlighter%2Fhits%2F44f59750b1bb3ec6c19d2a27d6e77a71&q=&lang=en&nativePrint=1&script=..%2Fexamples%2Fviewer-copy-fix.js&hlCopy=1&",
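One way to look at this value more closely is to decode it with Python's standard urllib.parse. The sketch below only decodes the documentUrl reported above; it does not call the Highlighter. It shows that the same "base,base" prefix also appears inside the highlightsFile parameter, which would be consistent with the base URL reaching the Highlighter twice and being joined with a comma. That reading is a guess, not a confirmed diagnosis.

```python
from urllib.parse import parse_qs, urlsplit

# The documentUrl value exactly as reported above, split for readability.
document_url = (
    "https://dev.infrastructurelawguide.com/highlighter,"
    "https://dev.infrastructurelawguide.com/highlighter/viewer/"
    "?file=https%3A%2F%2Fdev.infrastructurelawguide.com%2FFiles%2FPDF%2FDev-Test-PDF%2520ONly.pdf"
    "&highlightsFile=https%3A%2F%2Fdev.infrastructurelawguide.com%2Fhighlighter"
    "%2Chttps%3A%2F%2Fdev.infrastructurelawguide.com%2Fhighlighter%2Fhits%2F44f59750b1bb3ec6c19d2a27d6e77a71"
    "&q=&lang=en&nativePrint=1&script=..%2Fexamples%2Fviewer-copy-fix.js&hlCopy=1&"
)

# Decode the query string into its individual parameters.
params = parse_qs(urlsplit(document_url).query)
highlights_file = params["highlightsFile"][0]

print(params["file"][0])
print(highlights_file)
print(highlights_file.split(","))
```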
I have attached files here.
Does "comment out" mean removing serviceUrl from application.conf?
If yes, then we can remove it, but note that it is used in the old ISLG production environment.
ISLG (http://dev.investorstatelawguide.com/)
We did not understand your comments above. Can we have a quick Skype call now? Harsh and I are both available.
1. Do you currently have serviceUrl set in production for ISLG? If you do, then you cannot remove it in production, as otherwise we'd need to make some changes to the ISLG application as well.
2. What happens if you remove serviceUrl in your dev Highlighter instance? Does that solve the issue? Comma that you have in the URL is very strange and I'm not sure where it's coming from. Can you run the search service in debug mode and see what's sent to Highlighter in the header?
Let me know.
"PdfHighlighterUrl": "http://10.68.138.10:8998"
and getting it back as expected:
(It may not open the document properly due to CORS issues etc but URL to the viewer is created in accordance with settings.)
I have a sample PDF highlighter URL from the old ISLG production application.
https://www.investorstatelawguide.com/highlighter/viewer/?file=https%3A%2F%2Fwww.investorstatelawguide.com%2Fdocuments%2Fdocuments%2FIC-0064-01%2520Waguih.pdf&highlightsFile=..%5Chits%2F25aee07dbe75dbc6a6d7dd6fb84beb52&q=&lang=en&nativePrint=1&script=..%2Fexamples%2Fviewer-copy-fix.js&hlCopy=1&
In this URL, the PDF highlighter viewer and the PDF path are on the same web application, so if we change it to http://10.68.138.10:8998, it does not work.