For historical reasons, the application environment discussed here still relied on a proprietary document management system, IBM Content Manager OnDemand version 7.1, which the banking group had configured to use repository-based storage backed by magnetic tape.
The main problem was that, although the metadata associated with each document were held in an Oracle database, which made it possible to search documents by metadata (not very efficiently, since only some of the metadata were indexed), it was impossible to search the content of the documents themselves.
The problem was aggravated by the sheer size of the repository:
- 4 separate banks
- 22,000 classes of documents
- A history of several years (over 10 years for some types of document)
- Between 1-2 and several hundred pages per document
- Approximately 160,000 new documents per month
To try to solve the problem, the client had decided years earlier to implement the following search system. A first search narrowed down the range of documents using certain metadata (in practice only two metadata fields were available: the date of the document and its type). Once the range was defined, the user entered the term to look for in the content, and the documents were then downloaded ONE BY ONE so that their text could be searched for the term. Remember that some of them (the oldest) were stored on magnetic tape, so the retrieval time for the physical document was very high, as much as 10 minutes per document (the tape robot had to mount the tape onto virtual disks before the documents could be downloaded). Readers can easily imagine that the time users spent performing a search was unbearable.
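The two-phase search described above can be sketched roughly as follows. This is only an illustration, not the client's code: the `Document` fields, the sample repository and the `legacy_search` function are hypothetical, and the 10-minute figure is the worst case quoted in the text applied to every tape-resident document.

```python
from dataclasses import dataclass
from datetime import date

# Worst-case recall time quoted in the text for one document stored on tape.
TAPE_FETCH_MINUTES = 10


@dataclass
class Document:
    doc_id: int
    doc_type: str
    doc_date: date
    on_tape: bool
    text: str


def legacy_search(repository, doc_type, start, end, term):
    """Two-phase search: metadata filter, then fetch and scan each document."""
    # Phase 1: narrow the range using the only two usable metadata fields.
    candidates = [
        d for d in repository
        if d.doc_type == doc_type and start <= d.doc_date <= end
    ]
    # Phase 2: retrieve every candidate one by one and scan its text.
    hits, tape_minutes = [], 0
    for d in candidates:
        if d.on_tape:
            tape_minutes += TAPE_FETCH_MINUTES  # assumed worst-case recall
        if term.lower() in d.text.lower():
            hits.append(d.doc_id)
    return hits, tape_minutes


# Hypothetical sample data, for illustration only.
repository = [
    Document(1, "statement", date(2001, 3, 1), True, "Quarterly mortgage statement"),
    Document(2, "statement", date(2001, 4, 1), True, "Savings account summary"),
    Document(3, "statement", date(2009, 5, 1), False, "Mortgage interest notice"),
    Document(4, "contract", date(2001, 3, 15), True, "Mortgage contract"),
]

hits, minutes = legacy_search(
    repository, "statement", date(2001, 1, 1), date(2009, 12, 31), "mortgage"
)
print(hits, minutes)
```

The point of the sketch is that the cost grows with the number of candidate documents rather than with the number of matches: every document in the range has to be retrieved, whether or not it contains the term.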
Once the client decided to seek a real solution, it first contacted the manufacturer of the ECM, IBM, which proposed a new product called IBM OmniFind Enterprise Edition. This meant paying for new licenses, which were far from cheap, for a product that was in beta (the banking group would in fact have been the first customer in the world), had little track record in real production environments, was still under development and had an implementation time of several months. My team would have been responsible for integrating the product with the existing application.
My company understood that the scenario taking shape was unattractive from a technical standpoint, because of the immaturity of the product, and unappealing from a sales perspective, because it would strengthen a direct competitor (IBM also offered consulting and other services in addition to its products). We therefore decided to take a proactive part in the decision the client had to make.
The strategy was clear and precise: the development team I led had to respond quickly and accurately, preparing in record time a pilot that would satisfy the needs of the banking group.
The customer basically needed a tool that would handle downloading the documents, index them asynchronously in the background and give users full-text search over their content, all integrated in a distributed, highly available architecture that had to interoperate with environments such as .Net and Java. We decided that the best tool available to us was the Solr/Lucene combination: open source Java libraries with a huge developer community behind them. Being Java, this choice also matched the bank's recent strategic decision to migrate most of its systems to the Java world and away from Microsoft.
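To make the two core requirements concrete (background indexing and full-text search), here is a minimal sketch under stated assumptions. It is not the Solr or Lucene API and not the code of the pilot: `TinyIndex` and `indexing_worker` are hypothetical names, and the toy inverted index only stands in for the kind of structure a full-text engine such as Lucene maintains.

```python
import queue
import re
import threading
from collections import defaultdict


class TinyIndex:
    """Toy inverted index: term -> set of document ids."""

    def __init__(self):
        self._postings = defaultdict(set)
        self._lock = threading.Lock()

    def add(self, doc_id, text):
        terms = set(re.findall(r"\w+", text.lower()))
        with self._lock:
            for term in terms:
                self._postings[term].add(doc_id)

    def search(self, query):
        """Return the sorted ids of documents containing every query term."""
        terms = re.findall(r"\w+", query.lower())
        if not terms:
            return []
        with self._lock:
            result = set(self._postings.get(terms[0], set()))
            for term in terms[1:]:
                result &= self._postings.get(term, set())
        return sorted(result)


def indexing_worker(jobs, index):
    """Consume (doc_id, text) jobs until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            jobs.task_done()
            break
        doc_id, text = job
        index.add(doc_id, text)
        jobs.task_done()


index = TinyIndex()
jobs = queue.Queue()
worker = threading.Thread(target=indexing_worker, args=(jobs, index), daemon=True)
worker.start()

# Producer side: documents are queued as they are downloaded (sample data).
jobs.put((1, "Quarterly mortgage statement"))
jobs.put((2, "Savings account summary"))
jobs.put((3, "Mortgage interest notice"))
jobs.put(None)
jobs.join()  # wait until the background worker has drained the queue

print(index.search("mortgage"))
print(index.search("mortgage interest"))
```

Once a document has been indexed, a query only touches the postings for its terms, so no document has to be fetched from the repository at search time. In a real deployment this role would be played by the search engine itself, with the slow tape retrieval paid once per document during indexing instead of on every search.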
The main difficulties for the team were the short time we had to present a better pilot before the competition did, and our limited human resources (just 2 people, one of them part-time on the project) compared with the scale of the IBM center of excellence in Germany. But we had 3 weapons that could match any colossus: the deepest knowledge of the customer and its systems, the speed of reaction of a small team that empathized closely with the client and, above all, the challenge of beating a giant on its own ground.
They were hard but very satisfying days: we were working on something new and exciting. I'm not sure whether it was more because we saw it as a challenge to win or because we were fully convinced that our solution was going to completely change, from the point of view of usability, the document management system that had existed in the banking group for the past 20 years. Now that some time has passed, I think the combination of these two factors was the food that gave the slingshot-wielding David the strength he needed.
I will not recount the headaches or the many hours of spare time we put in, partly because I do not have a good memory for such things, but I will never forget the customer's face when, after about 10 days, we turned up in the office of the head of the Department of Architecture and Distributed Systems with a demo that did exactly what they had spent years wishing for and that was already integrated with their systems.
Meanwhile, somewhere in Germany, our competitors were still trying to launch a beta version of their (expensive) product...