Basecamp Export

Morgan Maguire, CEO

Hi everyone,

Here is the recording from today's meeting to discuss the changes in the UI design:

GMT20210520-151708_Recording_1920x1050.mp4 127 MB • Download

As discussed, because it will be very tight for us to complete and finalize the design across the different views within search, we'll aim to perform the dtSearch implementation in Sprint 4, between June 21st and July 12th.

Rob and Radomir, let us know if you foresee a problem with this timing, because I know Radomir mentioned that he might be taking time off after July 4th.

We also discussed and agreed to proceed as follows:

Melissa and Naomi will complete the modified design (user stories and wireframes) for presentation and discussion during the meeting scheduled for Thursday, May 27 at 8:15am PST.
- During the meeting we will determine whether modifications to the HTML templates will be required.
Assuming the design is finalized next week, we will integrate a proof of concept for the updated design into Sprint 3 (May 31 - Jun 21) for the Browse by Topic search (SQL search).
Assuming the proof of concept is a success in Sprint 3, we will proceed with integrating dtSearch into the Browse by Keyword search in Sprint 4.

Please share any questions or concerns in the comments below.

Thanks,

Morgan

May 20, 2021 at 6:34 PM Notified 12 people

Radomir Mladenovic

Morgan, Sprint 4 doesn't work for me as I'm not available from June 20 to July 2. That's why I said yesterday that I'd prefer an earlier sprint. It would be good if we could start in Sprint 3.

Thanks,
Radomir

May 20, 2021 at 7:29 PM Notified 12 people

Morgan Maguire, CEO

Ok. Thanks, Radomir.

Martin, Ketan and Melissa, let me know how you think we should proceed. It's important that we perform the work when Radomir is available.

Morgan

May 20, 2021 at 8:45 PM Notified 12 people

Morgan Maguire, CEO

Hi everyone,

Martin has suggested that we allocate time to the dtSearch implementation in Sprint #3 on the understanding that work would only start on the applicable tasks in the 2nd or 3rd week of the sprint. This would give us until June 4th to finalize the requirements of the relevant backlog items.

Does that work for everyone?

Morgan

May 20, 2021 at 10:06 PM Notified 12 people

Martin Laporte, CTO

Hi Melissa and Savannah,

Will you have enough bandwidth to produce the bulk of the requirements by June 4th?

Thanks,
--Martin

May 25, 2021 at 7:25 PM Notified 12 people

Melissa Cowell, General Manager

Martin

Yes, we will have wireframes for the updated requirements to demo on Thursday. Once those are confirmed we will finalize the written requirements ahead of June 4th.

Note, HTML templates will likely not be completed at that time and we will await the dev team's input there.

Mel

May 26, 2021 at 3:13 PM Notified 12 people

Martin Laporte, CTO

A recording of today's meeting can be found here:

https://islg.egnyte.com/fl/JMJB0vjN8V

Jun 14, 2021 at 4:23 PM Notified 12 people

Radomir Mladenovic

Hi team,

The first version of ILG dtSearch indexer and web service is available for download from:
https://1drv.ms/u/s!AugzRBG6eTFwiusxFHrIlT0MqIsfuQ?e=HU2FQe
All updates will be uploaded there as well.

Similarly to the ISLG project, invoke the indexer from the command line, specifying the config file:

TologixILGDBIndexer.exe indexer-config-ilg.json

The indexer will create 3 indexes used by the search service.

Check the indexer and the web service config files and adjust them to your environment.
In the web service delivery I included a sample JSON request and response (for the current content in the database). To run a search, send a POST request to /api/search/main
Options for the search request are the same as we used in the previous project.

Project meta fields in dtSearch index are named pmf<N> where N is the meta field ID from the database (e.g. pmf38). Document meta fields are named as dmf<N>.
To find how many hits you have in the project or document meta fields, you could simply check highlightedFields of the found document object and count hits you have in fields where the name starts with "pmf" and "dmf".
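The counting approach described above could be sketched as follows. This is only an illustration: it assumes that highlightedFields arrives as a mapping from field name to a list of highlighted fragments, which is a guess about the response shape and not something confirmed in this thread. The function name and the sample data are hypothetical.

```python
from typing import Dict, List


def count_meta_field_hits(highlighted_fields: Dict[str, List[str]]) -> Dict[str, int]:
    """Count highlighted fragments in project (pmf*) and document (dmf*) meta fields.

    Assumption: each key is an index field name and each value is a list of
    highlighted fragments for that field.
    """
    counts = {"project": 0, "document": 0}
    for name, fragments in highlighted_fields.items():
        if name.startswith("pmf"):
            counts["project"] += len(fragments)
        elif name.startswith("dmf"):
            counts["document"] += len(fragments)
    return counts


# Hypothetical sample: one project meta field hit, two document meta field hits.
sample = {
    "pmf38": ["<b>canada</b> roads"],
    "dmf7": ["first fragment", "second fragment"],
    "title": ["not a meta field"],
}
print(count_meta_field_hits(sample))
```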

Some issues I've noticed:

Not all results have matching "paragraphs". I suspect it's because some HTML documents could not be parsed (using the logic from the ISLG project), as I saw several such errors in the indexing log.
Fields with multiple values are all concatenated by dtSearch. We should probably use some separator character(s) as we did for ISLG. (I could either do this in the indexer, or you can do it in the view.)
Filtering on field values containing more than one word or special characters doesn't work well with dtSearch. I guess most of the meta field values are keyed so we should also index their IDs and filtering should be by IDs. (As we did in ISLG.)

For example, maybe you could change the vw_DTSearch_Project_Document_MetaFields to have both value IDs and labels in separate columns, where multiple values are separated by |||. One such row could look like this:

image.png 8.41 KB • Download

Obviously I don't have proper ValueIDs for this example but I hope you understand. (I excluded ContentTypeDataId column as it's not indexed.)
Using this data, I'd create two separate fields, pmf33 and pmf33id containing values and value IDs respectively.
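As a rough sketch of the idea above, the snippet below splits one row's labels and IDs on the ||| separator and produces the two fields (for example pmf33 and pmf33id). The function name, the row values and the IDs are made up for illustration; the real indexer is a separate executable and may do this differently.

```python
from typing import Dict, List

SEPARATOR = "|||"


def build_meta_fields(field_id: int, values: str, value_ids: str,
                      prefix: str = "pmf") -> Dict[str, List[str]]:
    """Split separator-joined labels and IDs into two parallel index fields."""
    labels = [v.strip() for v in values.split(SEPARATOR) if v.strip()]
    ids = [v.strip() for v in value_ids.split(SEPARATOR) if v.strip()]
    if len(labels) != len(ids):
        raise ValueError("each label needs exactly one matching ID")
    return {
        f"{prefix}{field_id}": labels,       # human-readable values
        f"{prefix}{field_id}id": ids,        # keys used for filtering
    }


# Hypothetical row: the IDs 101 and 102 are placeholders, not real ValueIDs.
print(build_meta_fields(33, "Transport - Roads|||Transport - Rail", "101|||102"))
```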

I hope we're on a good track. Let me know if you have any questions.

Jul 12, 2021 at 9:33 PM Notified 13 people

Jitesh Dhuravala

Hi Radomir,

Thank you for providing the indexer and web service to implement the dtSearch functionality in ILG.

I am trying to create the index but I am getting the following error. Can you please tell me the reason for this error? I adjusted the configuration for my system in the file "indexer-config-ilg" in the IndexJul2 folder, as shown below.

image.png 50.4 KB • Download

After the above changes, I am executing the command below from the command line:
"TologixILGDBIndexer.exe indexer-config-ilg.json" but I am getting the error below.

image.png 45.2 KB • Download

Can you please tell me the reason for the above error? I asked Harsh the same question, based on the ISLG implementation, and he told me we need to inform Radomir. So can you please tell us what we are missing, or whether the problem might be on your side?

Thanks,
Jitesh

Jul 13, 2021 at 11:01 AM Notified 13 people

Radomir Mladenovic

Hi

Jitesh

,

Looks like you either don't have the dtSearch dependencies installed, or you have some dtSearch DLL in the path (maybe under system) whose version conflicts with the version I sent you.

For dtSearch dependencies, check:
https://support.dtsearch.com/dts0197.htm
http://support.dtsearch.com/webhelp/dtsearchcppapi/_NET_Deployment__MFC_and_CRT_dependencies.html

If you find that you already have dtSearch, just copy your version of dtSearch DLLs to the indexer's folder.

Hope this helps.

Jul 13, 2021 at 1:31 PM Notified 13 people

Jitesh Dhuravala

Hi Radomir,

We need some clarification on the dtSearch response you provided, with respect to the ILG user story requirements. Can you please arrange a call to discuss the results?

We also need results for project and document meta fields. We are available whenever suits you.

Thanks,
Jitesh

Jul 23, 2021 at 9:03 AM Notified 13 people

Radomir Mladenovic

Hi Jitesh, I'm available Monday or Wednesday at 12 PM IST.

Jul 24, 2021 at 6:29 AM Notified 13 people

Jitesh Dhuravala

Hi Radomir,

Monday 12 PM IST works for me.

Thanks,
Jitesh

Jul 24, 2021 at 6:30 AM Notified 13 people

Jitesh Dhuravala

Hi Radomir,

Can you please confirm the time and the URL for the Zoom meeting?

Thanks,
Jitesh

Jul 26, 2021 at 3:51 AM Notified 13 people

Radomir Mladenovic

Hi Jitesh, as said above, Monday 12 PM IST (08:30 Vienna time). Please send a Zoom invite, or we can use Skype as before.

Jul 26, 2021 at 5:29 AM Notified 13 people

Jitesh Dhuravala

Hi Radomir,

As discussed in today's call, please provide an updated web service with the following features:

1) Document and project meta field counts, so we can display them as per the wireframe for the document details and project details sections.

2) Viewing the entire document with highlighting, as in the ongoing ISLG implementation.

3) We are trying to implement the meta field filter parameter as you explained for ISLG. We will let you know if we have any questions about it.

Thanks,
Jitesh

Jul 26, 2021 at 10:42 AM Notified 13 people

Radomir Mladenovic

Hi Jitesh,

You can find the search service update in the 2021-07-27 folder on OneDrive.

Besides including highlightedFields with fields containing highlighted keywords, I added documentFieldsMatches and projectFieldsMatches to indicate the number of matching fields in the document and project meta data.

image.png 27 KB • Download

Yesterday I delivered a solution for full document highlighting to the ISLG team. As discussed, let's wait for their feedback, as some changes may be needed. That should minimize possible issues after porting the code to ILG.

Let me know if you need any additional clarifications for sending filters.

Thanks,
Radomir

Jul 27, 2021 at 4:05 PM Notified 13 people

Hiren Patel

Hi Radomir,

I have added two columns, "RelevanceSorting" (number) and "DocumentNameSorting" (string), to the SQL view "vw_DTSearch_DocumentDetails". There are 4 types of sorting parameters that can be passed in the "Main" API request, but at most one is passed at a time. Please review the API request options below and provide more options for sorting.

"SortField":"RelevanceSorting",
"SortOrder":"desc",
OR
"SortField":"DocumentNameSorting",
"SortOrder":"asc",
OR
"SortField":"modifiedon",
"SortOrder":"desc",
OR
"SortField":"modifiedon",
"SortOrder":"asc",

Main_API_Shoting.png 11.7 KB • Download

Thanks
Hiren Patel

Aug 03, 2021 at 12:53 PM Notified 13 people

Radomir Mladenovic

Hi Hiren,
I'm not sure I understand what you're asking me. You need to support sorting by more than one field? That's not supported by dtSearch. To sort by more than one field, you have to create additional fields for indexing so that you combine multiple fields into a sortable string, and then you use that field for sorting. However, to make results sortable by 4 fields with asc/desc order... you'd have to provide quite a number of combinations. Not sure that's the way to go.
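To illustrate the "combine multiple fields into a sortable string" idea, here is a minimal sketch. It assumes a non-negative integer relevance value and a document name; the function name, the padding width and the separator are illustrative choices, not part of the indexer.

```python
def composite_sort_key(relevance: int, document_name: str, width: int = 6) -> str:
    """Build one string whose lexicographic order equals (relevance, name) order.

    The number is zero-padded to a fixed width so that string comparison
    agrees with numeric comparison.
    """
    if not 0 <= relevance < 10 ** width:
        raise ValueError("relevance does not fit the fixed width")
    return f"{relevance:0{width}d}|{document_name.lower()}"


keys = [
    composite_sort_key(120, "Alpha"),
    composite_sort_key(12, "Beta"),
    composite_sort_key(12, "Alpha"),
]
print(sorted(keys))
```

Each extra field combination and sort direction needs its own precomputed field, which is why the number of combinations grows quickly.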

Aug 03, 2021 at 9:14 PM Notified 13 people

Hiren Patel

Hi Radomir,

I am not sending multiple sorting parameters in the request, only a dynamic "SortField" and "SortOrder", but I didn't find my updated fields in your API response. Now I have found the "RelevanceSorting" and "DocumentNameSorting" fields in your API response. Were there any changes on the server from your side?

The problem with sorting by dynamic parameters is now solved.

Thanks
Hiren Patel

Aug 04, 2021 at 7:17 AM Notified 13 people

Hiren Patel

Radomir,

Does Contegra search provide search functionality for both PDF and HTML files?
I have only found results from HTML files. Please confirm.

Thanks
Hiren Patel

Aug 04, 2021 at 1:42 PM Notified 13 people

Radomir Mladenovic

Hi

Harsh

,

If I understand correctly, you now see the sorting fields in the response? No changes are needed in the search service; you can reference these fields in the search request.
(BTW, I didn't see RelevanceSorting and DocumentNameSorting in the ILGSME database so I guess you're indexing another database?)

As for searching HTML and PDF: if the HTML file exists, it will be indexed and not the PDF. The PDF is indexed only if the HTML is not available. That's similar to the ISLG project, except that there we also had a flag in the DB to tell the indexer whether the HTML exists; for ILG I check file existence.
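The selection rule described above can be pictured with a small sketch. It is not the indexer's actual code: the function name is hypothetical, and the existence check is passed in as a parameter only so that the example can be tried without real files.

```python
import os
from typing import Callable, Optional


def pick_file_to_index(html_path: Optional[str], pdf_path: Optional[str],
                       exists: Callable[[str], bool] = os.path.isfile) -> Optional[str]:
    """Prefer the HTML file when it exists on disk, otherwise fall back to the PDF."""
    if html_path and exists(html_path):
        return html_path
    if pdf_path and exists(pdf_path):
        return pdf_path
    return None  # nothing to index for this record


# Simulated file system for demonstration purposes.
on_disk = {"a.pdf", "b.html", "b.pdf"}
print(pick_file_to_index("a.html", "a.pdf", on_disk.__contains__))
print(pick_file_to_index("b.html", "b.pdf", on_disk.__contains__))
```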

Thanks,
Radomir

Aug 04, 2021 at 9:11 PM Notified 13 people

Radomir Mladenovic

Sorry, the above message was obviously for Hiren.

Aug 05, 2021 at 8:16 AM Notified 13 people

Hiren Patel

Hi Radomir,

I have generated the index on the database below. It is still showing only HTML files. I have attached screenshots here. Is anything missing on my side?

SQL Server : 192.16.138.11
Database Name: ILGSME
SQL View : "vw_DTSearch_DocumentDetails"

ILG_SQL.png 36.8 KB • Download

API_Request.jpg 186 KB • Download

ILG_Serch_Document.jpg 130 KB • Download

ILG_PDF_Only.png 43.5 KB • Download

Thanks
Hiren Patel

Aug 05, 2021 at 9:34 AM Notified 13 people

Radomir Mladenovic

Hi Hiren, I'll run re-indexing and see what's going on with the PDF.

Aug 05, 2021 at 10:47 AM Notified 13 people

Radomir Mladenovic

Hiren, as for the PDF, everything looks fine to me:

image.png 61.8 KB • Download

I suggest checking indexing.log for any error messages.

Make sure your indexing config file properly references the Highlighter, as it's used for page extraction.

image.png 26.9 KB • Download

Aug 05, 2021 at 11:32 AM Notified 13 people

Hiren Patel

Hi Radomir,

I have replaced the PDFHighlighterUrl value and it is working.

Thanks
Hiren Patel

Aug 05, 2021 at 12:22 PM Notified 13 people

Hiren Patel

Hi Radomir,

I have applied filtering on meta fields and checked it against some random data, but I am not getting any results. I have attached screenshots and some examples below.

1) Data shown without a filter.

ProjectUIN_01.jpg 125 KB • Download

2) With a filter applied on meta fields, no results are returned.

projectUINFilter.jpg 92.9 KB • Download

Checked with the data below.

"FilterStatement": {
"type": "match",
"field": "pmf11",
"value": "Social & Defence"
},
"FilterStatement": {
"type": "match",
"field": "pmf14",
"value": "Transport - Roads"
},
"FilterStatement": {
"type": "match",
"field": "pmf15",
"value": "Announced/In Procurement"
},
"FilterStatement": {
"type": "match",
"field": "pmf18",
"value": "Ontario Infrastructure"
},
"FilterStatement": {
"type": "match",
"field": "pmf19",
"value": "Infrastructure Ontario"
},
"FilterStatement": {
"type": "match",
"field": "pmf20",
"value": "BBPP Alberta Schools"
},
"FilterStatement": {
"type": "match",
"field": "pmf21",
"value": "Babcock & Brown (International Public Partnership) Gracorp Capital Advisors"
},

Thanks
Hiren Patel

Aug 06, 2021 at 10:21 AM Notified 13 people

Hiren Patel

Hi Radomir,

I have created an SQL view, vw_DTSearch_DocumentTopics_ParaList, in the ILGSME database. I need a distinct count of the HtmlParagraphReference field
and also a distinct list of HtmlParagraphReference values, per DocumentId, with the search request applied to the TagName field. This should be included in the Main API request.

Paralist_Tagwise.jpg 185 KB • Download

Let me know if further discussion is required.

Thanks
Hiren Patel

Aug 06, 2021 at 3:02 PM Notified 13 people

Radomir Mladenovic

Hi Hiren, as for the filtering issue, please check my comment from July 12: Filtering on field values containing more than one word or special characters doesn't work well with dtSearch. I guess most of the meta field values are keyed so we should also index their IDs and filtering should be by IDs. (As we did in ISLG.)

Aug 07, 2021 at 5:13 AM Notified 13 people

Hiren Patel

Hi Radomir,

As per today's Skype call, some changes are required in dtSearch.

1) dtSearch filtering does not work with special characters or with values of more than one word, so I have changed the SQL view vw_DTSearch_Project_Document_MetaFields and added two new columns,
ValueIDs and Value1IDs, which hold IDs instead of text, with multiple values separated by commas. Where the value is text, I have removed spaces and special characters in the new columns.

New_Filters.jpg 241 KB • Download

Apply filtering on ValueIDs. If the meta field type is a date range selection or currency, then apply the filter on both columns, ValueIDs and Value1IDs.

2) One more meta field category, Tag (Topic) MetaFields (ContentTypeCategoryId=3), will be added to vw_DTSearch_Project_Document_MetaFields for topic meta field filtering.

3) I have also added a TagId column to the SQL view vw_DTSearch_DocumentDetails for applying specific topic filters.

4) Keep the project and document meta fields with values as they are, and generate new project and document meta fields based on IDs for filtering.

Keep_MetaFields.jpg 159 KB • Download

5) Update the hitCount value to be the sum of the values below.

hitCount = documentFieldsMatches + projectFieldsMatches + paragraphs count + topic paragraphs count
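Two of the points above lend themselves to a small sketch: stripping spaces and special characters from text values (point 1) and the hitCount sum (point 5). Both functions are illustrations under assumptions. The real key columns are produced in the SQL view, and the response keys "paragraphs" and "topicParagraphs" used here are hypothetical names for the two paragraph lists.

```python
import re
from typing import Any, Dict


def to_filter_key(text_value: str) -> str:
    """Drop spaces and special characters so the value becomes a single term."""
    return re.sub(r"[^A-Za-z0-9]", "", text_value)


def total_hit_count(result: Dict[str, Any]) -> int:
    """Sum field matches and paragraph counts, as in the hitCount formula."""
    return (
        result.get("documentFieldsMatches", 0)
        + result.get("projectFieldsMatches", 0)
        + len(result.get("paragraphs", []))        # assumed list of paragraph hits
        + len(result.get("topicParagraphs", []))   # hypothetical key name
    )


print(to_filter_key("Transport - Roads"))
print(total_hit_count({
    "documentFieldsMatches": 2,
    "projectFieldsMatches": 1,
    "paragraphs": ["p1", "p2", "p3"],
    "topicParagraphs": ["t1"],
}))
```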

Let me know if further discussion is required.

Thanks
Hiren Patel

Aug 10, 2021 at 3:39 PM Notified 13 people

Radomir Mladenovic

Hi Hiren,

1) I extended the indexer to index the new columns Value1, ValueIDs and Value1IDs, using the same field name base as before and adding the suffixes "_1", "ids" and "ids_1" respectively.
As for applying filters on both ValueIDs and Value1IDs in the case of date range and currency, I'm afraid there's no way for the search service to know the fields' type beforehand. Your application should build an appropriate boolean filter expression using both fields.

2) OK, Tag Meta Fields will be indexed with prefix "tmf", followed by field ID, etc.

3) OK, TagId was added as a meta and can be searched:

image.png 6.61 KB • Download

4) OK, added as described in 1):

image.png 13.7 KB • Download

5) OK, changed the hitCount as requested.
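The naming scheme from points 1) and 2) can be summarized in a short helper. This is only a sketch of the convention as described in this thread (prefixes pmf, dmf, tmf and suffixes "_1", "ids", "ids_1"); the function and dictionary names are hypothetical.

```python
# Prefix per meta field category, as described in the thread.
PREFIXES = {"project": "pmf", "document": "dmf", "tag": "tmf"}

# Suffix per source column of vw_DTSearch_Project_Document_MetaFields.
SUFFIXES = {"Value": "", "Value1": "_1", "ValueIDs": "ids", "Value1IDs": "ids_1"}


def index_field_name(category: str, field_id: int, column: str = "Value") -> str:
    """Return the index field name for a meta field ID and source column."""
    return f"{PREFIXES[category]}{field_id}{SUFFIXES[column]}"


print(index_field_name("project", 33))              # pmf33
print(index_field_name("document", 7, "ValueIDs"))  # dmf7ids
print(index_field_name("tag", 50, "Value1IDs"))     # tmf50ids_1
```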

The indexer update is on OneDrive in folder 2021-08-11. You mentioned that I can copy indexer updates and created indices to a folder on the server. Please confirm to which path you'd like them to be saved.

Thanks,
Radomir

Aug 11, 2021 at 6:58 AM Notified 13 people

Radomir Mladenovic

Hiren, regarding indexing of vw_DTSearch_DocumentTopics_ParaList, could you please give me an example of the data you need in the results, say for the data in your screenshot?
Is the column DocUIN needed for searching or results? If not, let's not include it in the index, as it will "pollute" the index with additional terms (e.g. "on") that could cause false positive matches.

Aug 11, 2021 at 7:05 AM Notified 13 people

Hiren Patel

Hi Radomir,

The DocUIN field is not required, so I have deleted it from the view. Use DocumentId for the join between the two views vw_DTSearch_DocumentDetails and vw_DTSearch_DocumentTopics_ParaList.

SQL_Tag_Paralist.jpg 69.8 KB • Download

Tag_ParaList.jpg 104 KB • Download

Thanks
Hiren Patel

Aug 11, 2021 at 8:50 AM Notified 13 people

Hiren Patel

Hi Radomir,

Any updates on your July 27 comment?

Yesterday I delivered solution for full document highlighting to the ISLG team. As discussed, let's wait on their feedback as some changes may be needed. That should minimize eventual issues after porting the code to the ILG.

Thanks
Hiren Patel

Aug 11, 2021 at 9:05 AM Notified 13 people

Radomir Mladenovic

Hi Hiren, I didn't get any feedback from the ISLG team about the highlighting so I assume all is fine. I'll add highlighting methods to the ILG search service with the next update, Tuesday at the latest.

Thanks,
Radomir

Aug 12, 2021 at 8:09 AM Notified 13 people

Harsh Parikh, Tech Lead

Hi Radomir,

We have started integrating the highlighting service in ISLG and will contact you within 4 to 5 days if we face any issues.

Aug 12, 2021 at 8:19 AM Notified 13 people

Hiren Patel

Hi Radomir,

2) OK, Tag Meta Fields will be indexed with prefix "tmf", followed by field ID, etc.

As per your comment of 11 Aug 2021, for tag meta fields I have set ContentTypeCategoryId=3 and ContentTypeDataMasterId = DocumentId. I have changed indexer-config-ilg as follows:

"NestedQuery": "select * from vw_DTSearch_Project_Document_MetaFields where (ContentTypeDataMasterId = $DocumentContentTypeDataMasterId$ and ContentTypeCategoryId=2) or (ContentTypeDataMasterId = $ProjectContentTypeDataMasterId$ and ContentTypeCategoryId=1) or (ContentTypeCategoryId=3 and ContentTypeDataMasterId = $DocumentId$)",

DtSearch_Config.jpg 101 KB • Download

I am not able to see the Tag Meta Fields as tmf. Are any other changes required in the indexer-config-ilg file?

Thanks
Hiren Patel

Aug 13, 2021 at 5:21 AM Notified 13 people

Radomir Mladenovic

Hi Hiren,

You can find an update in the OneDrive folder 2021-08-15.

1) The indexer has been updated to support the changes to the NestedQuery for tag meta fields. I'm getting them in the results now.

2) The indexer has been updated to use vw_DTSearch_DocumentTopics_ParaList, and the search service has been updated as well to use the fourth index, "tagparas", referenced by ParaTagsIndexDir in the indexing config file and by ILGParaTagsIndex in the search service config.

I created the tagParagraphs structure a bit differently, to include details about the matched tag, as more than one tag may be found.

image.png 52.8 KB • Download

Let me know if you have any questions.

Thanks,
Radomir

Aug 15, 2021 at 7:52 PM Notified 13 people

Hiren Patel

Hi Radomir,

I followed the above process to generate the new index but I am getting the error below.

indexing.log 1.67 KB • Download

I am generating the index at the path below on the ILG staging server.
E:\ILGIndexer\Contegra Indexer generator\indexerJul12.

Thanks
Hiren Patel

Aug 16, 2021 at 10:52 AM Notified 13 people

Radomir Mladenovic

Hi Hiren, it looks like there's a JSON parsing error in your indexer config file. Please check it, or send it to me if you're not sure what's wrong.
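One quick way to locate such a parsing error before running the indexer is to load the config text with any strict JSON parser. The sketch below uses Python's standard json module; the function name is hypothetical, the folder path in the sample is a placeholder, and the indexer's own parser may report problems differently.

```python
import json


def check_json_config(text: str) -> str:
    """Return "ok" or the position and message of the first JSON syntax error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return "ok"


good = '{"ParaTagsIndexDir": "C:/placeholder/tagparas"}'
bad = '{"ParaTagsIndexDir": "C:/placeholder/tagparas",}'  # trailing comma
print(check_json_config(good))
print(check_json_config(bad))
```

For a file on disk, the same function can be called with the result of open(path, encoding="utf-8").read().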

Aug 16, 2021 at 11:51 AM Notified 13 people

Hiren Patel

Hi Radomir,

I have attached the config file.

indexer-config-ilg.json 1.2 KB • Download

Thanks
Hiren Patel

Aug 16, 2021 at 12:11 PM Notified 13 people

Radomir Mladenovic

Hi Hiren, in the indexer config the parameter name should be ParaTagsIndexDir; in the web service it's ILGParaTagsIndex. You mixed these up. Check the config samples that I put on OneDrive.
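A mix-up like this could be caught with a small sanity check over the parsed config. The sketch below assumes both parameters are top-level keys of their respective JSON files, which is not confirmed in this thread; the function name and role labels are hypothetical.

```python
from typing import Dict, List

# Which key belongs to which config file, per the message above.
EXPECTED_KEY = {"indexer": "ParaTagsIndexDir", "service": "ILGParaTagsIndex"}


def check_para_tags_key(config: Dict[str, object], role: str) -> List[str]:
    """Report a missing expected key and any key that belongs to the other file."""
    problems = []
    if EXPECTED_KEY[role] not in config:
        problems.append(f"missing {EXPECTED_KEY[role]}")
    for other_role, key in EXPECTED_KEY.items():
        if other_role != role and key in config:
            problems.append(f"unexpected {key}")
    return problems


# An indexer config that uses the web service's key by mistake.
print(check_para_tags_key({"ILGParaTagsIndex": "some/dir"}, "indexer"))
```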

Aug 16, 2021 at 2:17 PM Notified 13 people

Hiren Patel

Hi Radomir,

The indexing error is resolved.

Thanks
Hiren Patel

Aug 16, 2021 at 3:30 PM Notified 13 people

Radomir Mladenovic

Hi Hiren, I sent you a search service update in the 2021-08-17 folder, extended with HTML and PDF highlighting methods.
Exactly the same usage instructions apply as for the ISLG search service, so below I'm just copying the instructions I sent earlier to the ISLG team.

The update for the web service with document highlighting methods is under the 2021-07-26 folder on OneDrive (https://1drv.ms/u/s!AugzRBG6eTFwg4sJwygyvkWxOQnMWg?e=Sjhqhu)

I'm sending you all project files as there are several changes, and example request and response JSON payloads for HTML and PDF highlighting.

For HTML highlighting, use /highlight-html. The payload contains the basic search parameters, similarly to paragraph highlighting, and the docId parameter, which is the internal_file_id field of the document you want to show.

The response contains the highlighted HTML document (in the "content" field) and hits by paragraph under "paragraphHits".

image.png 19 KB • Download

You'll notice that the layout of the highlighted HTML is not the same. I think it's because dtSearch is stripping CSS from documents. I could not find a way/option to work around this. Maybe you could inject a reference to an external CSS file, if the HTML documents share the same styles.
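The stylesheet injection suggested above might look like the following sketch, applied to the "content" field of the response before rendering. It assumes the documents share one stylesheet; the function name and the CSS path are hypothetical.

```python
import re


def inject_stylesheet(html: str, css_href: str) -> str:
    """Insert a <link> to a shared stylesheet just before </head>.

    If the document has no </head>, the link is prepended instead.
    """
    link = f'<link rel="stylesheet" href="{css_href}">'
    head_close = re.compile(r"</head\s*>", flags=re.IGNORECASE)
    if head_close.search(html):
        # Use a function replacement so the link text is inserted literally.
        return head_close.sub(lambda m: link + m.group(0), html, count=1)
    return link + html


doc = "<html><head></head><body>x</body></html>"
print(inject_stylesheet(doc, "/css/documents.css"))
```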

To highlight a PDF, use the /highlight-pdf method. The payload contains the search parameters, the docId parameter, and docUrl, which should be the URL of the PDF in your web application.

The response payload contains documentUrl, which you should open in your web application.

You need to add the "PdfHighlighterUrl" parameter to the appsettings.json file, with the URL of the PDF Highlighter instance.
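As a rough picture of the two request payloads described above, the sketch below builds them as plain dictionaries. Only docId and docUrl come from the instructions; the search parameter names (searchRequest, SearchType) are borrowed from the sample search request elsewhere in this thread, and the function names, IDs and URL are hypothetical. The sample JSON payloads shipped with the service remain the authoritative reference.

```python
from typing import Any, Dict


def html_highlight_payload(search_request: str, doc_id: int,
                           **search_options: Any) -> Dict[str, Any]:
    """Payload for /highlight-html: search parameters plus docId."""
    payload: Dict[str, Any] = {"searchRequest": search_request, **search_options}
    payload["docId"] = doc_id
    return payload


def pdf_highlight_payload(search_request: str, doc_id: int, doc_url: str,
                          **search_options: Any) -> Dict[str, Any]:
    """Payload for /highlight-pdf: same as HTML, plus the PDF's URL in docUrl."""
    payload = html_highlight_payload(search_request, doc_id, **search_options)
    payload["docUrl"] = doc_url
    return payload


print(html_highlight_payload("canada", 42, SearchType="Boolean"))
print(pdf_highlight_payload("canada", 42, "https://example.invalid/docs/42.pdf"))
```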

Let me know if you have any questions.

Aug 16, 2021 at 10:32 PM Notified 13 people

Hiren Patel

Hi Radomir,

I am facing the issue below in highlight-html. Also, please accept DocumentId instead of internal_file_id, because we pass DocumentId in our system.

HTML-Highlighter-Issue.jpg 89.6 KB • Download

Thanks
Hiren Patel

Aug 17, 2021 at 9:03 AM Notified 13 people

Radomir Mladenovic

Hi Hiren, sorry I missed that error, because I was creating the index with the option enabled for caching original documents (HTML and PDF) in the index. In that case dtSearch can highlight a file without access to the original. However, it doesn't make much sense to cache documents in the index (as it will be huge in production) when they're available on the server.

Questions:

1) Do HTML/PDF files uniquely belong to rows in the vw_DTSearch_DocumentDetails? If an HTML or PDF file can be used by more than one record (different DocumentIDs), as is the case for ISLG, then it gets complicated and we'll have to create one more index to support this.

2) As one record has fields referencing both HTML and PDF files, do you need to highlight just one (e.g. only HTML if available, otherwise PDF) or you need possibility to highlight both HTML and PDF?

Thanks,
Radomir

Aug 17, 2021 at 8:30 PM Notified 13 people

Hiren Patel

Hi Radomir,

I took SearchController.cs from https://onedrive.live.com/?authkey=%21ABR6yJU9DKiLH7k&id=703179BA114433E8%21182549&cid=703179BA114433E8, replaced it in the dtSearch service and published the code, but I am still getting the same error. Are any additional changes required?

1) Do HTML/PDF files uniquely belong to rows in the vw_DTSearch_DocumentDetails? If an HTML or PDF file can be used by more than one record (different DocumentIDs), as is the case for ISLG, then it gets complicated and we'll have to create one more index to support this.

HTML/PDF files uniquely belong to rows in the vw_DTSearch_DocumentDetails. HTML and PDF files row wise uniquely associate with DocumentId in vw_DTSearch_DocumentDetails.

2) As one record has fields referencing both HTML and PDF files, do you need to highlight just one (e.g. only HTML if available, otherwise PDF) or you need possibility to highlight both HTML and PDF?

Need possibility to highlight both HTML and PDF.

If possible, let's discuss this on a call for a better resolution. Kindly arrange a call as per your availability.

Thanks
Hiren Patel

Aug 18, 2021 at 5:19 AM Notified 13 people

Radomir Mladenovic

Hi Hiren, the changes I made in the controller are just to reference the file by DocumentId instead of the internal ID. They do not address the highlighting issue.
As you need to highlight both HTML and PDF for the same record, we need to change the indexer to build one more index with all documents. Currently, only one of them is indexed (HTML if available, otherwise PDF) and it cannot be referenced for highlighting without caching.
I should have an update for you in a day or two.

Thanks,
Radomir

Aug 18, 2021 at 7:30 AM Notified 13 people

Hiren Patel

Hi Radomir,

Having only one of them indexed (HTML if available, otherwise PDF) is fine for us. There is no need to change the indexer. Are you saying that if I use DocumentId instead of the internal ID the error will be resolved, or are there changes still pending on your side?

Is it possible to connect on a call for better understanding?

Highlight-html.jpg 83.9 KB • Download

PDF-Highlighter.jpg 71.5 KB • Download

Thanks
Hiren Patel

Aug 18, 2021 at 7:49 AM Notified 13 people

Radomir Mladenovic

Hi Hiren, a change in the indexer is needed in any case. However, there are two approaches:

a) If you need HTML only (or PDF only if HTML is not available), then it's a simple change. (Also assuming that one HTML is not referenced by more than one record.)
I already made a change for this and you can find it in the 2021-08-18 folder, along with a change in the search service to accept documentId in docId.

b) If you want to be able to highlight both the HTML and the PDF of the same documentId, then an additional index needs to be created, with corresponding logic changes in the search service. Please let me know if you need to cover this use case.

Aug 18, 2021 at 12:12 PM Notified 13 people

Hiren Patel

Hi Radomir,

I have taken the changes from the 2021-08-18 folder. The HTML highlighter is working but the PDF highlighter is still not working. I have already added "PdfHighlighterUrl": "http://192.168.1.101:8998", to the indexer-config-ilg.json file. Are any other changes required?

00-PDF-Highlighter.jpg 56.9 KB • Download

a) If you need HTML only (or PDF only if not available), then it's a simple change. (Also assuming that one HTML is not referenced by more than one record.)
I already made a change for this and you can find it in the 2021-08-18 folder, along with a change in the search service to accept documentId in docId.

I agree with this approach. No PDF or HTML file is referenced by more than one record.

Thanks
Hiren Patel

Aug 18, 2021 at 1:57 PM Notified 13 people

Hiren Patel

Hi Radomir,

One more field, data-key, is required in the highlight-html response.

data-key.jpg 145 KB • Download

data-key-response.jpg 115 KB • Download

Let me know if you have any questions.

Thanks
Hiren Patel

Aug 19, 2021 at 11:36 AM Notified 13 people

Radomir Mladenovic

Hi Hiren, I sent you an update (2021-08-19) that adds the dataKey field to the highlight-html response.

Aug 19, 2021 at 7:34 PM Notified 13 people

Hiren Patel

Hi Radomir,

Some additional fields are required in dtSearch, as described below.

1) In highlight-html and highlight-pdf, return all project, document and tag meta fields, along with the highlighted meta fields and their count. I have added one column, MetaFieldName, to the SQL view vw_DTSearch_Project_Document_MetaFields, which should also be indexed as pmf<N>name, dmf<N>name and tmf<N>name.

Html-Highlighter-metaFields.jpg 166 KB • Download

2) In Main, add the same fields, pmf<N>name, dmf<N>name and tmf<N>name,
as described above.

Let me know if you have any questions.

Thanks
Hiren Patel

Aug 24, 2021 at 5:33 AM Notified 13 people

Hiren Patel

Hi Radomir,

I have some combinations of project, document and tag meta field filters to apply in the Main API request. Please prepare a sample JSON request for the filters below.

ProjectMetaFieldId(pmf3ids) = 3(Project Name) and ((Value = '178' OR Value = '215' OR Value = '252') AND (Value not equal to '145' OR Value not equal to '289' OR Value not equal to '326'))

AND

ProjectMetaFieldId(pmf2ids) = 2(Date Of Concession Agreement) and ((Value is between '08/17/2021' and '08/19/2021') AND (Value is not between '08/21/2021' and '08/24/2021'))

AND

TagMetaFieldId(tmf50ids) = 50(Risk allocated to Private Party) and (Value=true AND Value=False)

AND

DocumentMetaFieldId(dmf7ids) = 7(Legal Advisors) and ((Value = '17' OR Value = '22' OR Value = '47') AND (Value not equal to '11' OR Value not equal to '67' OR Value not equal to '88'))

Let me know if you have any questions.

Thanks
Hiren Patel

Aug 24, 2021 at 4:16 PM Notified 13 people

Radomir Mladenovic

Hi Hiren,

I sent you an indexer and search service update for the requirements you sent this morning.

I'll look into your filter examples and get back to you.

Thanks,
Radomir

Aug 24, 2021 at 6:25 PM Notified 13 people

Radomir Mladenovic

Hi Hiren, could you please explain what the expected behavior should be for the expression:

(Value not equal to '145' OR Value not equal to '289' OR Value not equal to '326')

If a document has a single Value, then this would match every document, because if Value=145, it's still not equal to 289 or 326, so the result would be TRUE.

Maybe you wanted:

not (Value equal to '145' OR Value equal to '289' OR Value equal to '326')

?
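The difference between the two expressions can be checked with a few lines of plain Python. This is a logic illustration only, not dtSearch filter syntax; the function names are made up.

```python
EXCLUDED = {"145", "289", "326"}


def or_of_not_equals(value: str) -> bool:
    """(Value != 145 OR Value != 289 OR Value != 326): true for any single value."""
    return any(value != excluded for excluded in EXCLUDED)


def not_any_equal(value: str) -> bool:
    """not (Value == 145 OR Value == 289 OR Value == 326): the exclusion intended."""
    return not any(value == excluded for excluded in EXCLUDED)


for value in ("145", "178"):
    print(value, or_of_not_equals(value), not_any_equal(value))
```

For the value "145" the first form is still true, because "145" differs from "289", while the second form correctly rejects it.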

Aug 25, 2021 at 5:43 PM Notified 13 people

Radomir Mladenovic

Hi Hiren, I'm having a hard time trying to understand some of the conditions. For example:

ProjectMetaFieldId(pmf3ids) = 3(Project Name) - looks like you're trying to limit search to pmf 3 with numerical values (178, 215, 252). However, indexed values for pmf3 don't have numerical values - IDs like "AlbertaSchoolsASAPI", "Sharmilatestproject", etc.
For pmf2, for date ranges, you need to format dates in the YYYYMMDD format, the same as for sorting. You need to fix this in the DB view.
For tmf50, you have condition (Value=true AND Value=False). For a single value field this doesn't make sense as it will always give FALSE as the result.
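The date format point can be illustrated with a short conversion sketch. It assumes the filter dates arrive as MM/DD/YYYY strings, as in the examples above; the function names are hypothetical, and in practice the conversion is expected to happen in the DB view.

```python
from datetime import datetime


def to_index_date(us_date: str) -> str:
    """Convert MM/DD/YYYY to YYYYMMDD so that string order equals date order."""
    return datetime.strptime(us_date, "%m/%d/%Y").strftime("%Y%m%d")


def in_range(value: str, start: str, end: str) -> bool:
    """Inclusive range check on YYYYMMDD strings using plain string comparison."""
    return start <= value <= end


start = to_index_date("08/17/2021")
end = to_index_date("08/19/2021")
print(start, end)
print(in_range("20210818", start, end))
```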

Thanks,
Radomir

Aug 25, 2021 at 6:58 PM Notified 13 people

Hiren Patel

Hi Radomir,

I think it would be better to discuss this on a call. Kindly arrange a call as per your availability.

Thanks
Hiren Patel

Aug 26, 2021 at 4:30 AM Notified 13 people

Radomir Mladenovic

Hi Hiren, the earliest I'm available for a call is Monday. If that's too late for you, please provide details here and I'll get back to you later today.

Thanks,
Radomir

Aug 26, 2021 at 8:05 AM Notified 13 people

Hiren Patel

Hi Radomir,

Monday is too late, so I will explain in detail here.

(Value not equal to '145' OR Value not equal to '289' OR Value not equal to '326')

"Not equal to" means exclude.

If a document has a single Value, then it would match every document - because if Value=145, it's still not equal to 289 or 326 so the result would be TRUE.

Maybe you wanted:

not (Value equal to '145' OR Value equal to '289' OR Value equal to '326')

Get all values excluding "145", "289" and "326".

Thanks
Hiren Patel

Aug 26, 2021 at 8:23 AM Notified 13 people

Radomir Mladenovic

Hiren, what about my other questions from last night?

Aug 26, 2021 at 1:06 PM Notified 13 people

Hiren Patel

Hi Radomir,

1. ProjectMetaFieldId(pmf3ids) = 3(Project Name) - looks like you're trying to limit search to pmf 3 with numerical values (178, 215, 252). However, indexed values for pmf3 don't have numerical values - IDs like "AlbertaSchoolsASAPI", "Sharmilatestproject", etc.

You have to apply the filters on the ids fields, like pmf3ids.

2. For pmf2, for date ranges, you need to format dates in the YYYYMMDD format, the same as for sorting. You need to fix this in the DB view.

I have already converted the date format to YYYYMMDD in the DB view.
"pmf39ids": "20050729"

3. For tmf50, you have condition (Value=true AND Value=False). For a single value field this doesn't make sense as it will always give FALSE as the result.

Value=true AND Value=False cannot both be true at the same time, so it does not return any result.

MetaFields-Filters.jpg 154 KB • Download

Thanks
Hiren Patel

Aug 26, 2021 at 1:43 PM Notified 13 people

Radomir Mladenovic

Hi Hiren, here's an example filter, close to what you asked for, except I used IDs that appear in the sample content in order to get results.

{
    "searchRequest": "canada",
    "SearchType": "Boolean",
    "Stemming": true,
    "WordNetSynonyms": false,
    "Fuzzy": false,
    "Fuzziness": "1",
    "FilterStatement": {
        "type": "boolean",
        "Operator": "and",
        "clauses": [
            {
                "type": "boolean",
                "Operator": "and",
                "clauses": [
                    {
                        "type": "boolean",
                        "Operator": "and",
                        "clauses": [
                            {
                                "type": "match",
                                "field": "pmf3ids",
                                "values": [
                                    "AlbertaSchoolsASAPI",
                                    "Sharmilatestproject"
                                ]
                            },
                            {
                                "type": "match",
                                "field": "pmf3ids",
                                "exclude": true,
                                "values": [
                                    "145",
                                    "289",
                                    "326"
                                ]
                            }
                        ]
                    },
                    {
                        "type": "range",
                        "field": "pmf2ids",
                        "from": "20050817",
                        "to": "20210819"
                    },
                    {
                        "type": "match",
                        "field": "tmf50_1",
                        "value": "false"
                    },
                    {
                        "type": "boolean",
                        "Operator": "and",
                        "clauses": [
                            {
                                "type": "match",
                                "field": "pmf34ids",
                                "values": [
                                    "608",
                                    "22",
                                    "47"
                                ]
                            },
                            {
                                "type": "match",
                                "field": "pmf34ids",
                                "exclude": true,
                                "values": [
                                    "11",
                                    "67",
                                    "88"
                                ]
                            }
                        ]
                    }
                ]
            }
        ]
    },
    "SortField": "modifiedon",
    "SortOrder": "desc",
    "PageNum": 0,
    "PageSize": 20
}
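
The include/exclude pairs in the request above follow a repeating pattern, so they could be generated rather than written by hand. The following is a minimal sketch in Python; the helper names (match_clause, include_exclude) are hypothetical, and only the JSON keys are taken from the sample request.

```python
import json

def match_clause(field, values, exclude=False):
    """Build one "match" clause using the keys seen in the sample request."""
    clause = {"type": "match", "field": field, "values": list(values)}
    if exclude:
        clause["exclude"] = True
    return clause

def include_exclude(field, include, exclude):
    """AND together an include list and an exclude list for the same field."""
    return {
        "type": "boolean",
        "Operator": "and",
        "clauses": [
            match_clause(field, include),
            match_clause(field, exclude, exclude=True),
        ],
    }

# Rebuilds the pmf34ids group from the sample request.
pmf34 = include_exclude("pmf34ids", ["608", "22", "47"], ["11", "67", "88"])
print(json.dumps(pmf34, indent=4))
```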

Aug 26, 2021 at 6:38 PM Notified 13 people

Hiren Patel

Hi Radomir,

Thanks. I will apply the filters and contact you if I run into any difficulty.

Thanks
Hiren Patel

Aug 27, 2021 at 4:39 AM Notified 13 people

Hiren Patel

Hi Radomir,

The ILG certification issue is resolved.

https://dev.infrastructurelawguide.com

The PDF highlighter proxy is currently available on

https://dev.investorstatelawguide.com, which is the ISLG domain, so we need a separate proxy for the ILG domain because of the cross-domain conflict. Please create a separate proxy for the PDF highlighter on
https://dev.infrastructurelawguide.com.

Cross_Domain_Error.jpg 458 KB • Download

Let me know if you have any questions.

Thanks
Hiren Patel

Sep 01, 2021 at 1:21 PM Notified 13 people

Radomir Mladenovic

Hi Hiren, sorry, I don't maintain your IIS - at least I have not so far. You just need to create one more proxy instance at the /highlighter path on the instance where you need it. You can use the same proxy installation folder, as it doesn't require any settings (when proxying to Highlighter on the same system).

Sep 02, 2021 at 1:16 PM Notified 13 people

Hiren Patel

Hi Radomir,

As discussed in today's call, one more feature is required in the PDF highlighter: when the user clicks a listed page number, the PDF should scroll to that page with the search word highlighted.

Example:

Suppose we search for the word "chile" and it is found 6 times in total in the PDF file.

Page No 1 (1 time)
Page No 2 (2 times)
Page No 3 (1 time)

When Page No 1 is clicked, the PDF file scrolls to page 1.
When Page No 2 is clicked, the PDF file scrolls to page 2.
And so on for the other pages.

Let me know if you have any questions.

PDF-Highlighter-Move.jpg 281 KB • Download

Thanks
Hiren Patel

Sep 10, 2021 at 4:37 PM Notified 13 people

Hiren Patel

Hi Radomir,

1) I have applied the PDF highlighter settings on my local (development) system, but the documentUrl generated by highlight-pdf is not correct.

pdf-highlighter.jpg 134 KB • Download

2) In highlight-html, when I search for "canada": if the word is found in both the metafields and the document, the result is displayed, but if the word is found only in the document, I do not get any result.

html-highlighter-htmlcontent.jpg 237 KB • Download

html-highlighter-working.jpg 182 KB • Download

html-highlighter-error.jpg 58.9 KB • Download

3) In Main, when I apply a filter on pmf3ids with the value AlbertaSchoolsASAPI, it works fine, but when I apply a filter on pmf3ids with the value AnthonyHendayDriveNorthwestEdmontonRingRoad, it does not work.

filter-meta-working.jpg 144 KB • Download

filter-meta-error.jpg 102 KB • Download

Main-Filter.jpg 145 KB • Download

Let me know if you have any questions.

Thanks
Hiren Patel

Sep 17, 2021 at 12:32 PM Notified 13 people

Hiren Patel

Hi Radomir,

I have some combinations of metafield filters to apply in the Main API request. Please create a JSON request for the filters below.

ProjectMetaFieldId(pmf2ids) = 2(Date Of Concession Agreement)((Value is after '08/17/2021') OR (Value is before '08/24/2021')))

AND

ProjectMetaFieldId(pmf43ids) = 43(Project Cost)((Value is equal to '25 CAD') AND (Value is not equal to '30 CAD') AND (Value is greater than '80.5 CAD')
(Value is less than '500.40 CAD') AND (Value is between '10 CAD' and '600 CAD') AND (Value is not between '700 CAD' and '900 CAD'))

AND

ProjectMetaFieldId(pmf47ids) = 47(Project Debt/Equity)((Value is equal to '50') AND (Value is not equal to '10') AND (Value is greater than '15.5')
(Value is less than '60') AND (Value is between '10' and '50') AND (Value is not between '60' and '100'))

AND

ProjectMetaFieldId(pmf41ids) = 41(Concession Period)((pmf41ids < '09/22/2021' ) AND pmf41ids > '09/23/2021' ) AND pmf41ids_1 < '09/24/2021' ) AND
pmf41ids_1 > '09/25/2021' ) AND pmf41ids >= '09/25/2021' ) AND pmf41ids_1 <= '03/20/2022' ) AND pmf41ids_1 <= '09/20/2022' ))

Agreement-Date-API.jpg 151 KB • Download

agreement-date-filter.jpg 46.9 KB • Download

Main-API-Project-cost.jpg 137 KB • Download

Project-cost-filter.jpg 105 KB • Download

Main-API-debt.jpg 134 KB • Download

Project-debt-filter.jpg 104 KB • Download

Concession-periods.jpg 165 KB • Download

Let me know if you have any concerns.

Thanks
Hiren Patel

Sep 20, 2021 at 3:01 PM Notified 13 people

Radomir Mladenovic

Hi Hiren,

1) I'm not sure what you mean by documentUrl not being proper. If you're talking about the base URL to Highlighter, you can change it by setting serviceUrl in Highlighter's application.conf to your user-facing Highlighter URL (e.g. "https://www.ilgapp.com/highlighter/").

2) Please send me the indexer config you're using so that I can test with the same data you're looking at and reproduce this case. I don't see document 13449 in the ILGSME sample database.

Regarding filters:

3) When searching for open date ranges (before/after), you can omit the "from" or "to" field and the search service will default to the min/max date.

4) dtSearch cannot search for numbers with decimal points and commas. Negative numbers are not supported either. You need to transform any such number into a padded number without a decimal point. For example, the number "30.5" could be transformed to "0000003050".
If you need to search with an open range, e.g. "greater than 15.5", you could search, for example, from "0000001551" to "9999999999".

I hope this helps. If you still need help with creating the above filters, let me know after you update the database fields with padded numbers and I'll make a sample JSON filter for the above boolean searches.
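
The padding in point 4 can be sketched as follows. This is an illustrative Python sketch, not code from the indexer or the DB view; the function names are hypothetical, and the width of 10 with 2 implied decimal places is an assumption chosen to match the "30.5" example.

```python
from decimal import Decimal

def pad_number(value: str, width: int = 10, decimals: int = 2) -> str:
    """Turn a non-negative decimal string into a fixed-width digit string."""
    scaled = Decimal(value) * (10 ** decimals)
    if scaled < 0:
        raise ValueError("negative numbers are not supported")
    if scaled != scaled.to_integral_value():
        raise ValueError("value has more decimal places than allowed")
    digits = str(int(scaled))
    if len(digits) > width:
        raise ValueError("value does not fit in the padded width")
    return digits.zfill(width)

def greater_than_range(value: str, width: int = 10, decimals: int = 2):
    """Open range "greater than value" as explicit (from, to) bounds."""
    lower = int(pad_number(value, width, decimals)) + 1
    return str(lower).zfill(width), "9" * width

print(pad_number("30.5"))
print(greater_than_range("15.5"))
```

Values that carry a unit, such as '25 CAD', would need the unit stripped before padding, and every field has to use the same width and number of decimal places for range comparisons to be meaningful.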

Sep 20, 2021 at 9:48 PM Notified 13 people

Hiren Patel

Hi Radomir,

1) I have used https://dev.infrastructurelawguide.com/highlighter/ and have attached appsettings.json for reference.

appsettings.json 1.39 KB • Download

2) I have attached the indexer config for reference, and I used the JSON request below for highlight-pdf.

{
"searchRequest": "canada",
"SearchType": "3",
"Stemming": "false",
"WordNetSynonyms": "false",
"Fuzzy": "false",
"Fuzziness": "1",
"docId": "178",
"docUrl": "https://dev.infrastructurelawguide.com/Files/PDF/Dev-Test-PDF ONly.pdf"
}

indexer-config-ilg.json 1.23 KB • Download

3) I have converted the numeric and number fields to padded numbers without decimal points, with a length of 15. Please generate a sample JSON for the above request.

Numer-converted.jpg 118 KB • Download

Thanks
Hiren Patel

Sep 21, 2021 at 9:09 AM Notified 13 people

Radomir Mladenovic

Hiren,

1) You need to set serviceUrl in Highlighter's application.conf! (Program Files\Highlighter\conf\)

Sep 21, 2021 at 12:06 PM Notified 13 people

Hiren Patel

Hi Radomir,

It is already set to http://10.68.138.10/highlighter/, which is used for both ILG and ISLG.

application.conf 4.17 KB • Download

Sep 21, 2021 at 12:15 PM Notified 13 people

Radomir Mladenovic

Hi Hiren,

As for the HomeDir indexer parameter, I don't have it in my indexer config file. (I guess you copied it from some earlier version.) When HomeDir is not specified, the indexer uses its own folder as home. The files that you should have there are default.abc, stemming.dat and noise.dat. I think you already got them with the indexer I sent you, but I'm attaching them just in case you didn't.

default.abc 901 Bytes • Download

noise.dat 541 Bytes • Download

stemming.dat 2.49 KB • Download

In the appsettings.json, you can point HomeDir to the indexer folder, or create a new folder with just these 3 files, whichever you prefer.

I'm sending you the updated indexer and web service in the OneDrive folder 2021-09-21.

The indexer has been updated to increase the maximum searchable word length from the default 21 chars to the maximum of 128 supported by dtSearch.
(https://support.dtsearch.com/webhelp/dtSearchNetApi2/dtSearch__Engine__Options.html - MaxWordLength option)

I tested search after reindexing and filtering on "pmf3ids": "TheRtHonHerbGrayParkwayformerlyWindsorEssexParkway" works now.
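
The earlier filter reports in this thread are consistent with the word-length limit: the pmf3ids value that worked is shorter than 21 characters, and the ones that failed are longer. The sketch below only compares string lengths against the two limits quoted in this message; it does not call dtSearch, and the function name is hypothetical.

```python
DEFAULT_MAX_WORD_LENGTH = 21   # default quoted in the message above
RAISED_MAX_WORD_LENGTH = 128   # maximum quoted in the message above

def exceeds_limit(value: str, limit: int) -> bool:
    """True when the value is longer than the given word-length limit."""
    return len(value) > limit

ids = [
    "AlbertaSchoolsASAPI",
    "AnthonyHendayDriveNorthwestEdmontonRingRoad",
    "TheRtHonHerbGrayParkwayformerlyWindsorEssexParkway",
]

for value in ids:
    print(
        value,
        len(value),
        exceeds_limit(value, DEFAULT_MAX_WORD_LENGTH),
        exceeds_limit(value, RAISED_MAX_WORD_LENGTH),
    )
```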

The search service has been updated to advise Highlighter to use the URL provided by the PdfHighlighterUrl service parameter. I also made one fix to prevent an error when there are no matches in meta fields. I'm not sure if that was affecting your highlighting of "economy", as I don't see the response code in your screenshot.
Please re-test this with the updated service and let me know.

Sep 21, 2021 at 8:29 PM Notified 13 people

Hiren Patel

Hi Radomir,

I have followed your instructions above, but the PDF URL is still not generated properly.

PDF-Highlighter-URL.jpg 161 KB • Download

1) In documentUrl, the part before the comma,
https://dev.infrastructurelawguide.com/highlighter
is an extra URL.
2) The PDF highlighter URL after the comma in documentUrl does not highlight any words.

"documentUrl": "https://dev.infrastructurelawguide.com/highlighter,https://dev.infrastructurelawguide.com/highlighter/viewer/?file=https%3A%2F%2Fdev.infrastructurelawguide.com%2FFiles%2FPDF%2FDev-Test-PDF%2520ONly.pdf&highlightsFile=https%3A%2F%2Fdev.infrastructurelawguide.com%2Fhighlighter%2Chttps%3A%2F%2Fdev.infrastructurelawguide.com%2Fhighlighter%2Fhits%2F44f59750b1bb3ec6c19d2a27d6e77a71&q=&lang=en&nativePrint=1&script=..%2Fexamples%2Fviewer-copy-fix.js&hlCopy=1&",

Sep 23, 2021 at 7:08 AM Notified 13 people

Radomir Mladenovic

Hiren, that's strange. Where is that comma coming from? Can you please send me your application settings and Highlighter's application.conf?
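
One way to narrow this down is to take the reported documentUrl apart and see where the comma appears. The sketch below uses Python's standard urllib.parse on the value quoted in the previous message; it only inspects the string and does not establish the cause.

```python
from urllib.parse import urlsplit, parse_qs

# documentUrl value as reported in the previous message.
document_url = (
    "https://dev.infrastructurelawguide.com/highlighter,"
    "https://dev.infrastructurelawguide.com/highlighter/viewer/"
    "?file=https%3A%2F%2Fdev.infrastructurelawguide.com%2FFiles%2FPDF%2F"
    "Dev-Test-PDF%2520ONly.pdf"
    "&highlightsFile=https%3A%2F%2Fdev.infrastructurelawguide.com%2Fhighlighter"
    "%2Chttps%3A%2F%2Fdev.infrastructurelawguide.com%2Fhighlighter%2Fhits%2F"
    "44f59750b1bb3ec6c19d2a27d6e77a71"
    "&q=&lang=en&nativePrint=1&script=..%2Fexamples%2Fviewer-copy-fix.js&hlCopy=1&"
)

parts = urlsplit(document_url)
params = parse_qs(parts.query)
highlights_file = params["highlightsFile"][0]

print(parts.path)        # path of the viewer URL
print(highlights_file)   # decoded highlightsFile parameter
print("," in parts.path, "," in highlights_file)
```

The comma-joined base URL shows up both in the viewer path and inside the highlightsFile parameter, so the duplication appears to be present in the base URL value itself before the viewer link is composed.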

Sep 23, 2021 at 7:52 AM Notified 13 people

Hiren Patel

Radomir,

I have attached the files here.

application.conf 4.17 KB • Download

appsettings.json 1.39 KB • Download

Sep 23, 2021 at 8:01 AM Notified 13 people

Radomir Mladenovic

Hiren, in the application.conf, comment out serviceUrl and restart Highlighter. Let me know if that helps.

Sep 23, 2021 at 8:12 AM Notified 13 people

Hiren Patel

Radomir,

Does "comment out" mean removing serviceUrl from application.conf?

If yes, then we can remove it because it is being used in the old ISLG production environment.
ISLG (http://dev.investorstatelawguide.com/)

Sep 23, 2021 at 8:21 AM Notified 13 people

Radomir Mladenovic

Hiren, yes, I meant to remove it. But check what the current production Highlighter is using. If it has serviceUrl set, then keep it, as we'll need to make this work with the current production Highlighter so we don't have to change the ISLG application as well.

Sep 23, 2021 at 9:41 AM Notified 13 people

Hiren Patel

Radomir,

We don't follow your comments above. Can we have a quick Skype call now? Harsh and I are both available.

Sep 23, 2021 at 9:46 AM Notified 13 people

Radomir Mladenovic

Hiren, sorry, I'm not available for a call today.

1. Do you currently have serviceUrl set in production for ISLG? If you do, then you cannot remove it in production, as otherwise we'd need to make some changes to the ISLG application as well.

2. What happens if you remove serviceUrl in your dev Highlighter instance? Does that solve the issue? The comma that you have in the URL is very strange and I'm not sure where it's coming from. Can you run the search service in debug mode and see what's sent to Highlighter in the header?

Let me know.

Sep 23, 2021 at 11:22 AM Notified 13 people

Radomir Mladenovic

Hiren,

I cannot reproduce the issue you have. In the appsettings.json I have:

"PdfHighlighterUrl": "http://10.68.138.10:8998"

and I'm getting it back as expected:

image.png 18.8 KB • Download

(It may not open the document properly due to CORS issues etc., but the URL to the viewer is created in accordance with the settings.)

Sep 23, 2021 at 7:53 PM Notified 13 people

Hiren Patel

Radomir,

I have a sample PDF highlighter URL from the old ISLG production application.

https://www.investorstatelawguide.com/highlighter/viewer/?file=https%3A%2F%2Fwww.investorstatelawguide.com%2Fdocuments%2Fdocuments%2FIC-0064-01%2520Waguih.pdf&highlightsFile=..%5Chits%2F25aee07dbe75dbc6a6d7dd6fb84beb52&q=&lang=en&nativePrint=1&script=..%2Fexamples%2Fviewer-copy-fix.js&hlCopy=1&

In this URL, the PDF highlighter viewer and the PDF path are on the same web application, so if we change it to http://10.68.138.10:8998, it does not work.

Sep 24, 2021 at 4:25 AM Notified 13 people

TOLOGIX - Infrastructure LawGuide (ILG)

dtSearch Implementation in ILG

Comments & Events