Posted by Morgan Maguire · May 31, 2019 at 10:13 PM

New Search Engine

Hello all,

I would like to raise an issue to discuss at the next status meeting (scheduled for Thursday, June 6th). Currently, we rely on dtSearch as our search engine for performing keyword searches across PDF documents and SQL databases. However, with the rebuilt ISLG and ILG, we will be eliminating the need to search through PDF documents (which I understand is the primary reason we rely on dtSearch), and I want to explore the opportunity of using other search engines that provide more sophisticated ways of searching through SQL databases. In particular, I would like us to develop a search tool that allows us to manipulate the search algorithm and customise it to influence search result rankings.

For example, currently with the Full Text Search on ISLG, we are limited to ranking results by the number of hits for a searched keyword. However, I would like to introduce other factors that will determine result rankings (e.g., not only keyword frequency, but also density, prominence and proximity within the text of a document). In addition, I would like to give us the ability to influence relevancy based on other related data (e.g., the frequency a document is referred to by other documents through the Citators).
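To make the ranking factors above concrete, here is a minimal sketch of how they might be combined into a single score. It is only an illustration: the function names, the weights and the formulas for each factor are assumptions, not how dtSearch, Elastic or the ML prototype actually rank results.

```python
import math
import re

# Hypothetical weights; in practice these would be tuned against real queries.
DEFAULT_WEIGHTS = {
    "frequency": 1.0,
    "density": 1.0,
    "prominence": 1.0,
    "proximity": 1.0,
    "citations": 1.0,
}


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def relevance_score(query, text, citation_count=0, weights=None):
    """Combine several ranking signals into one score (illustrative only)."""
    w = {**DEFAULT_WEIGHTS, **(weights or {})}
    terms = set(tokenize(query))
    tokens = tokenize(text)
    # (position, term) for every occurrence of a query term, in document order.
    hits = [(i, tok) for i, tok in enumerate(tokens) if tok in terms]
    if not hits:
        return 0.0

    frequency = math.log1p(len(hits))             # number of hits, dampened
    density = len(hits) / len(tokens)             # hits relative to document length
    prominence = 1.0 - hits[0][0] / len(tokens)   # earlier first hit scores higher
    # Proximity: smallest gap between occurrences of two different query terms.
    gaps = [b[0] - a[0] for a, b in zip(hits, hits[1:]) if a[1] != b[1]]
    proximity = 1.0 / min(gaps) if gaps else 0.0
    citations = math.log1p(citation_count)        # e.g. references from the Citators

    return (
        w["frequency"] * frequency
        + w["density"] * density
        + w["prominence"] * prominence
        + w["proximity"] * proximity
        + w["citations"] * citations
    )
```

With equal weights, a document where the query terms sit close together scores higher than one of the same length where they are far apart, and a higher citation count lifts the score further. Adjusting the weights is where we would "influence" the rankings.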

Devaang and Jitesh, further to our previous discussions with the Machine Learning R&D team, the R&D is starting to produce tangible results that may be used for these types of searches. Could you please add a comment below updating everyone on the progress and on how soon we'll be able to integrate these searches into the rebuilt ISLG and ILG?

In addition, assuming the ML team is not able to deliver a satisfactory search tool, I would like to start exploring other search engines (e.g., Elastic: https://www.elastic.co/) and flesh out the pros and cons of each option.

Mitch, Kevin and Ryan, it would be great to get Industrial's input on these issues.

Thanks,

Morgan

Comments & Events

Ryan Knuth, Customer Support Manager

Morgan,

Unfortunately, I'm out of the office on Thursday, so I will miss the meeting. Elastic is the system we typically deploy in Drupal projects when more complex search is required, and we have also looked into Apache Solr in the past.

We'd need to do more research in the context of ISLG and the requirements, but I think both would be good options to consider.

Thanks!

Ryan

Jun 03, 2019 at 7:54 PM Notified 12 people

Morgan Maguire, CEO

OK. Sounds good, Ryan.

Mitch, could you attend the call in Ryan's absence? I'd like to get the conversation going on this issue, and I'd like to get as much input as possible from everyone.

Thanks,

Morgan

Jun 03, 2019 at 8:40 PM Notified 12 people

Mitch Doyle

Morgan,

I'll be there Thursday to listen to the requirements and get more details on the issue. Thanks.

Jun 04, 2019 at 12:49 PM Notified 12 people

Morgan Maguire, CEO

Great. Thanks, Mitch. Look forward to discussing things on Thursday.

Jitesh and Devaang, as requested above, please provide your comments below in advance of the call on Thursday to give everyone an opportunity to examine the proposed options.

Thanks,

Morgan

Jun 04, 2019 at 2:39 PM Notified 12 people

Jitesh Dhuravala

Morgan,

I know you are eagerly waiting for our update on the ML project. Dhrumil and Hiran know more about it than I do, so I have asked them about progress. They have finished the ML work and loaded its output into Elasticsearch manually, and they tell me they are now automating that process. If you would like to know more, please contact them on our ML Project card, or add them here, so that they can give you better information. On Thursday's call they will explain what they have done and what remains to be done to finish this first POC.

Thanks,
Jitesh

Jun 05, 2019 at 6:58 AM Notified 12 people

Morgan Maguire, CEO

Jitesh,

Sounds good. I've added Dhrumil and Hiran to this project so that we can consolidate discussions here, and move things more into the context of implementation within ISLG and ILG.

Dhrumil and Hiran, in advance of the call tomorrow, would it be possible for you to provide a summary of the work done so far, and of how you suggest (taking into account the requirements above) we implement the tools you're building for the front-end searches in ISLG and ILG?

Thanks,

Morgan

Jun 05, 2019 at 4:35 PM Notified 14 people

Devaang Bhatt

Hi Morgan,

Apologies for stepping in before Dhrumil or Hiran could respond. Dhrumil has prepared a presentation, which we will all review tomorrow on GTM screen share. The PoC is complete, with only some clean-up items remaining. We will also see the demo prototype in action after the presentation.

A written summary may not make sense right now, as the context may be lost or misinterpreted. I am sure we will all have greater clarity after the presentation and the demo tomorrow.

Look forward to our interaction tomorrow.

Best regards,
Devaang Bhatt | AVP, International Business
Microsoft Specialist, MCP

Jun 05, 2019 at 4:55 PM Notified 14 people

Morgan Maguire, CEO

OK. Sounds good, Devaang. However, would it be possible for Dhrumil to post copies of the slides in advance, so that I can review them and refer back to them afterwards?

I'd like to ensure everyone is up to speed in advance of the presentation, so that we can ask informed questions.

Thanks,

Morgan

Jun 05, 2019 at 5:01 PM Notified 14 people

Devaang Bhatt

Hi Morgan,

Here’s the PPTX per your request. Please review it so that we can have a comprehensive conversation together tomorrow.

ISLG presentation.pptx 889 KB • Download

Jun 05, 2019 at 6:43 PM Notified 14 people

Morgan Maguire, CEO

Great. Thank you, Devaang.

Morgan

Jun 05, 2019 at 6:44 PM Notified 14 people

Dhrumil Shah

Morgan,

As discussed on our last call, I have set up a test environment on our server, using 30 sample PDFs, for our AzureML-based Advanced Search. Right now this search is based on frequency; we can take it to the next level, as you want, in the upcoming phase.

Here is the link to the website. You can share this link with your SMEs for testing.

http://tologix.demo.wwhnetwork.net/

For reference, I have attached the 30 PDFs that we used to produce the results. Let me know if you have any concerns, and please send us your feedback so that our team can move forward.

30SamplePDF.zip 48.1 MB • Download

Jul 03, 2019 at 11:36 AM Notified 14 people

Morgan Maguire, CEO

Thanks, Dhrumil. This looks very interesting.

Paul and Nafiseh, could you take a look and let me know what you think about the results? Note that what we're looking for is the relevancy of the suggested keywords presented by the tool.

Thanks,

Morgan

Jul 03, 2019 at 11:24 PM Notified 14 people

Morgan Maguire, CEO

Dhrumil,

Paul and Nafiseh are putting together a report with their feedback on the demo auto-suggest search above, and they have requested a copy of the Excel spreadsheet that contains all the keywords that were extracted from the 30 PDF files (see slide 16 of your presentation above, on the next deliverable). Could you please provide us with that spreadsheet?

These keywords will help inform their assessment on whether the demo is providing relevant and accurate results. It may also help us understand why connector words like "and" and "or" are currently omitted from the results.

Thanks,

Morgan

Jul 08, 2019 at 11:10 PM Notified 15 people

Dhrumil Shah

Morgan,

I have attached the CSV file containing the list of keywords and phrases that we used for the auto-suggest. Regarding your concern about the "and" and "or" logic: right now the auto-suggester works like Google search. It first shows the best matches for the words entered. If you have typed more than one word, it first looks for phrases that contain all of the words you entered. If no result is found, it then tries different word combinations, dropping one or more words, and gives you the best match.

And yes, if offering "and" and "or" operators would appeal more to your clients, we can also provide that option, as Google does in its advanced search.
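The fallback behaviour described in this comment (all words first, then smaller combinations of words) could be sketched roughly as follows. This is an illustrative assumption only, not the prototype's actual code: the function name `suggest`, the whole-word matching and the result limit are all hypothetical.

```python
from itertools import combinations


def suggest(query, phrases, limit=5):
    """Return phrases matching as many of the query words as possible.

    Tries all query words first; if nothing matches, drops one word at a
    time (then two, and so on) until some phrase matches.
    """
    words = query.lower().split()
    for size in range(len(words), 0, -1):
        matches = []
        for subset in combinations(words, size):
            for phrase in phrases:
                phrase_words = phrase.lower().split()
                # Whole-word matching is an assumption made for simplicity.
                if all(w in phrase_words for w in subset) and phrase not in matches:
                    matches.append(phrase)
        if matches:
            return matches[:limit]
    return []
```

For example, with the phrases "fair and equitable treatment", "equitable remedy" and "abuse of process", the query "fair equitable" matches only the first phrase, while "fair remedy" matches no phrase in full and falls back to phrases containing either word.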

Tologix Demo _Predictive Exp._ - 506153734175476c4f62416c57734963.faa6ba63383c4086ba587abf26b85814.v1-default-1752 - Results dataset (3).csv 75.3 KB • Download

Jul 09, 2019 at 6:45 AM Notified 15 people

Morgan Maguire, CEO

Dhrumil,

Thank you for the keyword and phrase list. We'll follow up with a more detailed report in the days ahead, but it is curious that none of the entries in the list are phrases containing connector words, such as "fair and equitable treatment", "tantamount to expropriation" or "abuse of process". This is important because these phrases are crucial concepts that appear frequently throughout the sample documents, and clients would expect to see them suggested. To give you some context, here are the suggestions Google produces for these phrases:

image.png 21.5 KB • Download

image.png 24.9 KB • Download

image.png 23.4 KB • Download

My expectation is that the auto-suggest search above would produce similar suggested phrases that include the connector words.

Thanks,

Morgan

Jul 09, 2019 at 9:15 PM Notified 15 people

Dhrumil Shah

Morgan,

Yes, you are right. Right now connector words are missing, but we can add them. Please note down all of your observations and send me the list; I will then look into each one and let you know what I find.

Jul 10, 2019 at 5:11 AM Notified 15 people

Morgan Maguire, CEO

Great. Thanks, Dhrumil. Yes, we'll summarize everything in the report.

Morgan

Jul 10, 2019 at 2:49 PM Notified 15 people

Morgan Maguire, CEO

Dhrumil,

Below is a report summarizing the findings from our testing of the auto-suggest prototype. Please let me know if you have any questions. Also, we're having a team call with Jitesh, Harsh and the team at Industrial on Thursday. Feel free to join if you want to discuss anything in more detail.

Thanks,

Morgan

Report on ML Auto-Suggest Prototype (17-Jul-19).docx 1.68 MB • Download

Jul 17, 2019 at 8:30 PM Notified 14 people

Dhrumil Shah

Morgan,

Thanks for your valuable feedback. We will study it and respond to each of the points you raised in the report.

Jul 18, 2019 at 8:21 AM Notified 14 people

Dhrumil Shah

Morgan,

I have attached our progress report. Right now we are facing a problem with words such as 'and', 'or' and 'of': if we include these terms, the phrases that are generated are not accurate. I have given an example in the attachment for your reference.

Our team is currently working on generating accurate phrases that include 'and', 'or' and 'of'.
At the same time, our team is also working to match the results that Contegra currently provides you.
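For readers following along, one common way to keep connector words without flooding the list with noisy phrases is to allow them only inside a phrase, never at its start or end, and to keep only phrases that recur. The sketch below illustrates that idea; it is an assumption for discussion, not the approach described in the progress report, and the stop-word list, `max_len` and `min_count` values are hypothetical.

```python
import re
from collections import Counter

# Illustrative stop-word list; a real list would be longer.
STOP_WORDS = {"and", "or", "of", "to", "the", "a", "in"}


def candidate_phrases(text, max_len=4, min_count=2):
    """Count multi-word phrases that may contain stop words internally.

    A phrase is rejected if it starts or ends with a stop word, so
    "fair and equitable treatment" is kept but "and equitable" is not.
    """
    counts = Counter()
    # Split on sentence punctuation so phrases never span two sentences.
    for sentence in re.split(r"[.!?;:]+", text.lower()):
        tokens = re.findall(r"[a-z]+", sentence)
        for n in range(2, max_len + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                if gram[0] in STOP_WORDS or gram[-1] in STOP_WORDS:
                    continue
                counts[" ".join(gram)] += 1
    # Keep only phrases that occur at least min_count times.
    return {phrase: c for phrase, c in counts.items() if c >= min_count}
```

The frequency threshold is what filters out accidental word sequences; how accurate the surviving phrases are would still need to be checked against the sample documents.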

Please look into the progress report and let me know in case of any concerns.

TologixProgressReport.docx 60 KB • Download

Aug 01, 2019 at 1:49 PM Notified 15 people

Morgan Maguire, CEO

Thanks for the update, Dhrumil. I don't have any specific concerns at this point, but let us know when an updated demo is ready for further testing. Also, if you need direction on what documents to use to expand the sample, please let us know.

Morgan

Aug 01, 2019 at 2:04 PM Notified 15 people

Dhrumil Shah

Okay, thanks, Morgan.

I will update you on this once we are ready for the next demo.

Aug 01, 2019 at 2:22 PM Notified 15 people

Morgan Maguire, CEO

Great. Thanks, Dhrumil.

Morgan

Aug 01, 2019 at 2:26 PM Notified 15 people

Dhrumil Shah

Morgan,

We have now implemented logic that lets us keep stop words such as 'and', 'or' and 'of' when generating phrases, but we still have to measure how accurate the resulting phrases are. Once we have that data, we will be able to give you an update.

Aug 22, 2019 at 2:42 PM Notified 15 people

Morgan Maguire, CEO

Sounds good, Dhrumil. Thanks for the update.

Morgan

Aug 22, 2019 at 2:48 PM Notified 15 people