- At least some documents in each of the doc, docx, odt, odf and pdf formats
- In each format we aim to support, at least some documents with images inside
- In each format, at least some documents with images that contain text
- Depending on the search and displace operations, we need to make sure that the test documents contain at least some text that fits the operation and some text that might break it
- Each operation would need to define its before and after requirements for the test documents
- Documents in the test suite need to be complete pairs, i.e. docs in the “before” state and the desired “after” state. The QA process is to diff the docs at the MD stage (to check that tags and tag operations have completed successfully) and visually at the final output stage
- We’d need a tool to highlight differences at the end of a test between the result and the expected result, not only to assert whether the test was successful but also to help identify what went wrong.
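
As a rough illustration of the last two points, here is a minimal sketch of a pair check at the MD stage. It assumes that "MD" means Markdown text and that the "after" document and the actual result are both available as strings. The function names `diff_markdown` and `check_pair` are hypothetical and not part of any existing tool; only Python's standard `difflib` module is used.

```python
import difflib


def diff_markdown(expected: str, actual: str, name: str = "doc.md") -> list[str]:
    """Return unified-diff lines between the expected and actual Markdown.

    The list is empty when the two texts are identical line by line.
    `name` is only used to label the diff headers (hypothetical file name).
    """
    return list(
        difflib.unified_diff(
            expected.splitlines(),
            actual.splitlines(),
            fromfile=f"expected/{name}",
            tofile=f"actual/{name}",
            lineterm="",
        )
    )


def check_pair(expected: str, actual: str, name: str = "doc.md") -> tuple[bool, str]:
    """Compare a result against its "after" document.

    Returns (passed, report): `passed` is True when there is no difference,
    and `report` is a readable diff showing what went wrong otherwise.
    """
    diff = diff_markdown(expected, actual, name)
    return (not diff, "\n".join(diff))
```

In a test, something like `ok, report = check_pair(expected_md, result_md)` followed by `assert ok, report` would both assert success and print the differing lines on failure. The visual check at the final output stage is not covered by this sketch.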
Not sure I’d put the image stuff at this stage, that belongs with OCR - we could gather test docs with images, but not see them as something we would be checking yet.
Yes, I think the image OCR could be added in the next stages.