ILG Document Collection Upload and HTML Coding
Hi Ketan,
We had a Tologix team call earlier, and the ILG document collection is finalized and ready for integration into the application. We now need to start the following processes:
- Batch upload all project and document data from the ILG Project and Document Master List: https://islg.egnyte.com/dl/sfYwQWYdxo. Katrina is making some final changes to the list this afternoon, but the project and document list is finalized and ready for integration into the application. Note that, similar to ISLG, we will continue to update the ILG Master List with new content as projects and documents are released in the months ahead.
- HTML code all the documents. Similar to ISLG, we need to perform PDF Quality Analysis on all documents identified in the master list (2990 documents) and determine which documents will require manual PDF coding and which can be processed and converted using the automated PDF to HTML converter.
Let's discuss how we should approach these tasks during our call tomorrow, and then we can discuss these more broadly with the team during our call on Thursday morning. Depending on the context of our conversation, I may invite Katrina and Paul to join the call on Thursday morning as well (8:15 am Vancouver time).
Thanks,
Morgan
I have discussed with
1, Before we start development of the batch process, we need to discuss it, so please let us know when you are available.
2, We can start the PDF quality report as we did for ISLG. Could you please provide the list of documents (source files) so that we can generate the report?
We hope to receive the source documents today so that we can start as soon as possible.
Thanks,
Jitesh
How about we do a call on Friday at 8:15am Vancouver time? I'd
Here is a link to the source documents: https://islg.egnyte.com/fl/TmxP3c15v8.
Thanks,
Morgan
Ok, Friday at 8:15 am Vancouver time is fine for us. We will download the documents from the URL provided above and start generating the PDF quality report.
Thanks,
Jitesh
Morgan
Here is the recording of the meeting earlier today:
Thanks,
Morgan
An issue we didn't discuss during the meeting was where we are uploading all the data and documents, because we are still working on the SME trial. My suggestion is that we upload everything to http://dev.infrastructurelawguide.com/, and then sync the data with http://www.infrastructurelawguide.com/ after the SME trial is complete.
What are your thoughts?
Thanks,
Morgan
Yes, we will first upload the data and documents to Dev, and then we will sync them to production.
Thanks,
Jitesh
Morgan
Please find attached the ILG PDF Quality Report. We have added a column for the UIN number, as you requested. We have generated the quality report for the first 1444 PDF files, and the report for the remaining files is in progress.
Thanks,
Jitesh
I'll review this with
Thanks,
Morgan
I have processed all the ILG documents to generate the PDF quality report. After processing all the files, we found the following types of files.
Please find attached the PDF quality report, which lists all the file types.
Morgan
Hope this is okay, but I took a look at the issues and resolved most of the duplicate UIN number issues. The only project that I didn't fix was the Presidio Parkway, as that was one of the initial prototype projects.
As a heads up for the Royal Inland Hospital (Sr. No. 1839), I had to update the UIN numbers for documents BC/0018/0043 - BC/0018/0047.
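For reference, a duplicate-UIN check of this kind can be sketched in a few lines of Python. This is only an illustration: the `(serial_number, uin)` row layout and the function name are assumptions, not the master list's actual structure, and it treats the `BC/0018/0043` and `BC-0018-0043` spellings as the same UIN.

```python
from collections import defaultdict


def find_duplicate_uins(rows):
    """Return {uin: [serial numbers]} for every UIN used by more than one row.

    `rows` is an iterable of (serial_number, uin) pairs. This layout is an
    assumption for illustration, not the master list's real column order.
    """
    seen = defaultdict(list)
    for serial, uin in rows:
        # Normalise case and separators so "BC/0018/0043" and
        # "BC-0018-0043" compare equal.
        key = uin.strip().upper().replace("/", "-")
        seen[key].append(serial)
    # Keep only UINs that appear on more than one row.
    return {uin: serials for uin, serials in seen.items() if len(serials) > 1}
```

Any UIN returned here would still need a manual decision on which row keeps the number and which gets a new one.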
In the meantime, let us know if you have any questions or issues concerning the uploading of the data from the ILG Project and Document Master List.
Thanks,
Morgan
For the ILG documents, we need you to confirm how the following cases should be handled during manual HTML conversion.
1, Form: The PDF with UIN number AB-0013-0005 contains a form. Do we need to generate the form in HTML or display [REDACTED]? I am asking because the form contains checkboxes, fields and underlines for input. Please confirm.
2, Paragraph bullet without content: e.g. UIN AB-0013-0022
In this case, many PDFs contain bullets without content. What should we display for the content? Should it be left empty, or should [REDACTED] be displayed? Please confirm.
3, Some text found written manually: UIN BC-0001-0001
We have found extra text written by pen. Should we include this text or not? Please confirm.
Thanks,
Jitesh
Could you please re-post the screenshots above? The images are extremely small and difficult to see.
Thanks,
Morgan
I have re-attached the screenshots. Please provide your feedback.
1, Form
2, Paragraph Bullet without text
3, Some Text found written manually.
Thanks,
Jitesh
Please deal with the issues above as follows:
1. Forms: Please represent the forms as best as possible, but check-boxes can be represented as bullets.
2. Paragraph Bullet without text: It's difficult to understand exactly what's going on here, but please include the bullet number once with [REDACTED] afterwards (s.25 [REDACTED]).
3. Some Text found written manually: disregard all written text.
Thanks,
Morgan
Before we start the data upload, could you please tell me whether all existing project and document data (meta field values, lists, etc.) should remain as it is, or be removed and replaced with fresh data from the batch process as per the spreadsheet provided?
I have found dropdown list prepared by
Thanks,
Jitesh
Morgan
Thanks. We will wipe all data from Dev and insert fresh data as per the spreadsheet.
Thanks,
Jitesh
Ok. I see what you mean. There appears to be an issue with the US States and Territories list that is causing the problem. To resolve it, I suggest wiping these lists and uploading fresh versions. Here is the data you can use for the lists:
The rest of the lists will be populated by the applicable entries within the master list that Katrina will be providing later today.
Thanks,
Morgan
My apologies for the delay but I have updated and finalized the master list.
Thanks,
Katrina
Thanks for updating and finalizing the master list. We will consider the attached spreadsheet by
Project Document:
1, For documents, we have not found any meta fields in the application that correspond to the spreadsheet provided, so we cannot identify the field types (dropdown, textbox, etc.).
2, I think the following fields do not need to be considered when uploading project document data into the database. Please confirm.
I request that you list only the columns that need to be considered for the data upload into the database. Also, please create the meta fields for project documents in the application.
Project :
In the application, the field set "Contractor(s)/Subcontractor(s) (Role)" contains three meta fields (D&B Contractor, D&B Design and D&B Construction), but in the spreadsheet the "Contractor(s)/Subcontractor(s) (Role)" column holds values separated by semicolons ( ; ) and contains four (4) types of values. Since only three (3) meta fields have been created, where should we add the value for "O/M contractor"?
Second, I suggest creating separate columns for the "Contractor(s)/Subcontractor(s) (Role)" values, as you did for industry: Industry sector and Industry sub sector have separate columns in the spreadsheet, and in the application they all sit in the "Industry" field set. Please follow the same approach for "Contractor(s)/Subcontractor(s) (Role)" so that we can do proper batch processing.
For example
In application :
In Spreadsheet :
Please provide your feedback.
Thanks,
Jitesh
Here is updated version of the master list:
We have removed all columns that are not relevant to meta fields within the application (except column F (Link to Document) in Project Documents in case this information is needed).
For the issue concerning the Contractor(s)/Subcontractor(s) (Role) fields, we have reorganized the master list separating the entries into the respective fields:
Also, I adjusted the fields in dev.infrastructurelawguide.com so that all the Contractor(s)/Subcontractor(s) (Role) fields pull from the same Contractor(s)/Subcontractor(s) List. This means that the same values should be used to populate all four fields, and that duplicate values between fields should be consolidated. For example, you'll see the value "EllisDon" appear in the "D&B Construction", "D&B Contractor(s)" and "O&M Contractor" fields. These are all referring to the same "EllisDon" value.
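To illustrate the consolidation, here is a rough Python sketch of building one shared list from several role columns. The function name, the dict-per-row layout, and the assumption that a single cell may still hold several ";"-separated names are all hypothetical, and it is not how the batch process itself is implemented.

```python
def build_contractor_list(project_rows, role_columns):
    """Collect one de-duplicated, sorted list of contractor names.

    `project_rows` is a list of dicts keyed by column name, and
    `role_columns` names the role columns to read. Both are assumptions
    about how the Project tab might be loaded.
    """
    names = {}
    for row in project_rows:
        for column in role_columns:
            # A cell may be empty, or may hold several names separated by ";".
            for raw in (row.get(column) or "").split(";"):
                name = raw.strip()
                if name:
                    # Compare case-insensitively; keep the first spelling seen.
                    names.setdefault(name.casefold(), name)
    return sorted(names.values())
```

With this approach, "EllisDon" appearing in three different role columns ends up as a single entry in the shared list, and each field then references that one entry.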
I also added the "External Sources" field to the application, which will be populated by column AL (External Sources) in the Project tab. Note that multiple URLs are separated by ";" and should be treated as separate values. Also, we did not include data for the "Display Text" field. I assume if this isn't included in the URL field, it will just display the URL.
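As a sketch of how the External Sources cells could be split, assuming the cell is read as a plain string and that the display text simply falls back to the URL (my assumption above, not confirmed application behaviour), something like the following Python would work. The function name and the record keys are hypothetical.

```python
def parse_external_sources(cell):
    """Split a ';'-separated cell into a list of link records.

    Each record uses the URL as its display text, since the master list
    carries no separate "Display Text" data.
    """
    links = []
    for raw in (cell or "").split(";"):
        url = raw.strip()
        if url:  # skip blanks left by trailing or doubled separators
            links.append({"url": url, "display_text": url})
    return links
```

Note that this would wrongly split any URL that itself contains a ";", so those entries would need to be checked by hand.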
Thanks,
Morgan
I am writing to let you know that we discovered an issue with a number of documents in the master list above that are currently being reviewed by
We have discovered that some documents are very large (500+ pages and more than 100MB). The large size is caused by the consolidation of project agreements and schedules into one document. We have decided to subdivide these documents into separate documents, and integrate them as separate entries in the master list. Note that this may affect some of the documents that are already undergoing manual PDF to HTML coding.
Thanks,
Morgan
We are doing the final data upload for projects and project documents and verifying it on our local system. We have the following questions, so please provide your feedback.
1, Can you please tell me what the PDF name should be for each project document? You have not provided a PDFName column.
2, When inserting project document data, which status should the document have? I think it should be "Edit in Progress", and a user must then upload the PDF and HTML later in order to see/update the document data in the application.
Thanks,
Jitesh
Just to clarify, when you say that the PDFName column is missing are you referring to the Master List or the ILG HTML Manual Coding spreadsheet?
Thanks,
Katrina
No, I am talking about the data upload of projects and project documents.
Thanks,
Jitesh
You will have to retrieve the PDF filenames based on the provided Egnyte URL links. At the same time, the UINs are contained in the filenames, so you could use that to identify the filenames as well.
I'd like all the data and documents published and visible on the subscriber side of the application. Please have them uploaded in the Published state and then we will selectively edit documents that require tagging.
Also, just confirming that data and documents are getting uploaded to dev.infrastructurelawguide.com. Please do not make any alterations to www.infrastructurelawguide.com.
Thanks,
Morgan
We are doing the data upload using a batch process, which is automated. With the links you have provided, it seems we would have to open each link manually and copy the file name, and there are approximately 3000 documents. Can you please suggest or provide a way to get all the document names so that we can complete the data upload?
Thanks,
Jitesh
As I mentioned above, all the documents have the UIN in the filename. My suggestion is that you use this to match the UIN in the master list. This should be easily done with Excel once you populate the filenames into the master list and run a few functions.
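If Excel proves awkward, the same matching can be sketched in Python. This is only an illustration: it assumes UINs look like two letters followed by two four-digit groups (as in BC/0018/0043 or AB-0013-0005), that the filename uses "-", "_", "/" or a space between the groups, and the function names and the sample filename are hypothetical.

```python
import re

# Two letters, then two 4-digit groups, with "-", "_", "/" or a space between.
# The separator set is an assumption about how the files were named.
UIN_PATTERN = re.compile(r"([A-Z]{2})[-_/ ](\d{4})[-_/ ](\d{4})")


def uin_from_filename(filename):
    """Return the UIN found in a filename as 'XX-0000-0000', or None."""
    match = UIN_PATTERN.search(filename.upper())
    if match is None:
        return None
    return "-".join(match.groups())


def match_filenames(filenames, master_uins):
    """Group filenames by the master-list UIN they contain.

    Returns (by_uin, unmatched): a dict of UIN -> list of filenames, and a
    list of filenames whose UIN is missing or not in the master list.
    """
    wanted = {u.strip().upper().replace("/", "-") for u in master_uins}
    by_uin, unmatched = {}, []
    for name in filenames:
        uin = uin_from_filename(name)
        if uin in wanted:
            by_uin.setdefault(uin, []).append(name)
        else:
            unmatched.append(name)
    return by_uin, unmatched
```

For example, a file named `BC-0018-0043 Project Agreement.pdf` would be grouped under `BC-0018-0043`, while anything in `unmatched`, or any UIN with more than one file, would need a manual look.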
Thanks,
Morgan
once you populate the filenames into the master list
That is what I am talking about: it will take a long time to populate the filenames into the Excel sheet from the URLs. Can you please provide another way, or somewhere to get all the filenames from, rather than opening each URL?
Thanks,
Jitesh
Can't you take the filenames from the applicable folders in Egnyte and then match them with the UIN?
Thanks,
Morgan
I will add in the filenames and upload a new version of the master list shortly.
Thank you,
Katrina
I have updated the master list to include a column with the pdf file names. Please let me know if you have any comments or questions.
Thank you,
Katrina
Morgan
We have executed the batch process and the data has been uploaded to the dev site (http://dev.infrastructurelawguide.com). Please check it and provide your feedback.
Thanks,
Jitesh
The data upload has been done on the dev site. Can you please confirm that everything is fine, so that we can run the same process on the live site (http://infrastructurelawguide.com/Admin)?
Thanks,
Jitesh
I'll take a closer look when I'm back from holidays next week.
Note we will not be updating the live site until the SME trial is complete.
Thanks,
Morgan
I've taken a look at the development site and had some trouble opening the original pdfs. Sometimes the documents will load but more often than not I receive the following error message:
Thanks,
Katrina
Could you report this as a to-do in the following group: ILG Testing Feedback - Fix Now - TOLOGIX - Infrastructure LawGuide (ILG) for Tologix and assign the task to
Thanks,
Morgan
I have added this task to the to-do list as a priority item.
Thanks,
Katrina
I'll give you a call later today to catch up.
Thanks,
Morgan
As per discussion with
https://docs.google.com/spreadsheets/d/1nI-SZFYaOAfSFU0eyrOmDHmNdFseppHOwC5DoRyR5Og/edit#gid=597349052
Actually, we have already processed the ILG files that are of poor quality, taken from the first batch of ILG files in the ILG PDF quality list you provided earlier. We are not aware of your subdivided files (big files split into smaller ones) or their source. Can you please provide more information so that we can correct and process the files accordingly?
Thanks,
Jitesh
Please find attached the revised version of the master list.
Thanks,
Katrina
Please find attached an updated version of the master list. I am re-uploading it in this thread so it is easily accessible for everyone. This master list has been revised to correct the technical advisors issue and to include placeholder documents.
Thanks,
Katrina