Wednesday, October 19, 2011

Athento now has a 98% accuracy rate auto-classifying documents

Last week, our engineers improved Athento's accuracy rate in auto-classifying documents from 96% to 98%. With this new result in document recognition and classification, Athento overtakes other solutions such as EMC Captiva, Kofax Capture or Ephesoft.

For Yerbabuena that's only a figure, but for our clients it means greater savings.

Let's see it with an example. Consider the following data:

Daily document input: 200 documents per day
Average time spent recognizing and classifying documents: 5 minutes/document (16.67 work hours per day)
Hourly rate of pay: 10 dollars

Cost working with paper: 40,008 dollars/year
Cost at 96% accuracy rate: 3,199.9 dollars/year
Cost at 98% accuracy rate: 1,599.9 dollars/year
Annual difference between rates: 1,600 dollars/year
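The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the post does not say how many working days it uses, and 240 days per year is simply the value that reproduces the paper cost once the daily hours are rounded to 16.67. The costs at 96% and 98% are copied from the example rather than derived, and the function names are hypothetical.

```python
# Inputs taken from the example above.
DOCS_PER_DAY = 200
MINUTES_PER_DOC = 5
HOURLY_RATE_USD = 10.0

# Assumption: 240 working days per year. The post does not state this;
# it is the value that reproduces the 40,008 dollars/year figure.
WORK_DAYS_PER_YEAR = 240


def daily_hours(docs_per_day: int, minutes_per_doc: float) -> float:
    """Hours of manual work per day, rounded to two decimals as in the post."""
    return round(docs_per_day * minutes_per_doc / 60, 2)


def annual_paper_cost(hours_per_day: float, hourly_rate: float, work_days: int) -> float:
    """Yearly cost of doing all recognition and classification by hand."""
    return round(hours_per_day * hourly_rate * work_days, 2)


hours = daily_hours(DOCS_PER_DAY, MINUTES_PER_DOC)
paper_cost = annual_paper_cost(hours, HOURLY_RATE_USD, WORK_DAYS_PER_YEAR)

# Costs at each accuracy rate are copied from the post, not derived here.
COST_AT_96 = 3199.9
COST_AT_98 = 1599.9
annual_saving = round(COST_AT_96 - COST_AT_98, 2)

print(f"{hours} hours/day, {paper_cost} dollars/year on paper")
print(f"Saving from 96% to 98%: {annual_saving} dollars/year")
```

Changing the daily input or the hourly rate at the top is enough to adapt the estimate to another organization.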


Monday, October 17, 2011

Athento helps you manage critical information in your emails

In this post we'll tell you the most important things about managing critical information in your emails and how Athento can help you.

Companies are increasingly concerned about managing the information in their emails. AIIM's report 'State of the ECM Industry 2010' says that managing emails as records is one of the top priorities for ECM nowadays, along with implementing a Records Management System and integrating multiple repositories.

Emails are a special type of content: they deserve to be treated as records because, once sent or received, they shouldn't be modified. At the same time, employees increasingly rely on email in their daily work, so the information it contains is becoming critical in many cases. According to the Radicati Group, the average corporate user sends and receives about 110 emails a day. With figures like these, it's no surprise that different countries are starting to treat emails as evidentiary documents. Regulation in this area, at least in Spain, is not very advanced and there is no description of the requirements applicable to an email, but emails are now admitted at trial.

How does Athento IDM help you manage the information in your emails?

Athento IDM and, in general, the whole Athento product family include a functionality named 'PopMail Input'. It is a component of Athento that helps us manage emails in two ways:

Uploading to the document manager by sending attachments to an email account: the platform receives the attached document, stores it and indexes its information.
Documenting our emails: PopMail Input can convert a received email into a PDF document that includes all the information contained in it (sender, time, date, subject, body, etc.), identify it with a unique code and store it in the document manager without allowing further modifications.
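To make the second mode more concrete, here is a minimal Python sketch of the idea, not Athento's actual implementation. It parses a raw email with the standard library, collects the fields mentioned above and attaches a unique code. The `EmailRecord` class, the `email_to_record` function, the use of a SHA-256 digest as the unique code and the sample message are all assumptions made for illustration; rendering to PDF is left out.

```python
import hashlib
from dataclasses import dataclass
from email import message_from_string, policy


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class EmailRecord:
    code: str           # unique code (assumption: SHA-256 of the raw message)
    sender: str
    date: str
    subject: str
    body: str
    attachments: tuple  # file names of any attachments


def email_to_record(raw: str) -> EmailRecord:
    """Turn a raw email into an immutable record with a unique code."""
    msg = message_from_string(raw, policy=policy.default)
    body_part = msg.get_body(preferencelist=("plain",))
    body = body_part.get_content() if body_part is not None else ""
    names = tuple(part.get_filename() or "" for part in msg.iter_attachments())
    code = hashlib.sha256(raw.encode("utf-8")).hexdigest()
    return EmailRecord(
        code=code,
        sender=str(msg["From"] or ""),
        date=str(msg["Date"] or ""),
        subject=str(msg["Subject"] or ""),
        body=body,
        attachments=names,
    )


# Hypothetical sample message, used only to exercise the sketch.
RAW = (
    "From: sales@example.com\n"
    "To: popmail@example.com\n"
    "Subject: Agreed price and terms\n"
    "Date: Mon, 17 Oct 2011 10:00:00 +0200\n"
    "\n"
    "Price and terms agreed as discussed.\n"
)

record = email_to_record(RAW)
print(record.code[:12], record.sender, record.subject)
```

Because the code is derived from the message content, storing the same email twice yields the same code, which also makes accidental duplicates easy to spot.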

Can you give me an example?

  • Example of use to manage pharmacological alerts in a hospital pharmacy: Hospital pharmacies typically receive alerts from the Spanish Agency for Medicines and Health Products reporting defects or potential health risks that medications may cause. The problem with this type of document is that it usually reaches the pharmacy staff through several channels and, as a result, the information is duplicated. With PopMail Input it's possible to centralize the receipt of alerts in a single account and check which alerts are already in the system, so we don't end up re-creating them.
  • Example of use in incident handling and negotiations with customers: Imagine that someone in your company's sales department reaches an agreement on price and terms with certain customers via email. For some reason, the salesperson who reached the agreement leaves the company and the customer is assigned to a new account manager. The new salesperson does not have access to the predecessor's emails, or simply finds it really hard, among so many emails, to locate the one that closed the sale and set the service conditions. Fortunately, the previous salesperson was smart enough to copy every email to the customer to an email account monitored by PopMail. The new salesperson only has to look in the repository, where there is already a folder with the client's name containing all the commercial offers made to the client and the client's responses. Retrieving a document from a repository is almost immediate.
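For the pharmacy example, the check for alerts that are already in the system could look roughly like the following Python sketch. It is an assumption about one simple way to do it (fingerprinting the normalized text of each alert), not a description of how PopMail Input works; the `AlertInbox` class and the sample alert texts are hypothetical.

```python
import hashlib


class AlertInbox:
    """Keeps track of the alerts already stored, to avoid re-creating them."""

    def __init__(self):
        self._seen = set()

    @staticmethod
    def fingerprint(text: str) -> str:
        # Ignore differences in letter case and whitespace between copies.
        normalized = " ".join(text.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def register(self, text: str) -> bool:
        """Return True if the alert is new, False if it was already stored."""
        key = self.fingerprint(text)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


inbox = AlertInbox()
# Hypothetical alert texts arriving through different channels.
first = inbox.register("Alert: batch 123 of product X withdrawn")
second = inbox.register("ALERT:  batch 123 of product X   withdrawn")
third = inbox.register("Alert: batch 456 of product Y withdrawn")
print(first, second, third)
```

Exact-text fingerprints only catch copies of the same alert; alerts that are reworded between channels would need a looser comparison.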

We hope we have explained a little more about this functionality of Athento. If you have any questions, please feel free to contact us.


Wednesday, October 12, 2011

ECM: Where Is Enterprise Content Management Heading?

"In this post we review current market characteristics and ECM trends for the coming years"

We can say that we are already in a mature market. Maturity has its degrees and is directly proportional to the maturity of national economies, but in general we can speak of a consolidated marketplace with well-defined characteristics.
Let's take a look.

Market Features

The ECM market is ...

Big: It is expected to hit $5.7 billion by 2014.

With big fish: The ECM sea is also a sea of big fish. In more formal terms, the ECM market is an oligopoly in which the 3 most important brands hold more than 50% of the pie.

Difficult sales: The ECM market is not a shoe market. Companies do not consider buying or changing a repository once a month because, in most cases, moving from one content management system to another involves high costs, and not only in money. Changing a document manager, or starting to work with one for the first time, involves a change in how employees work every day.

Very similar solutions with a tendency toward fragmentation: The solutions offered in the ECM field are often very similar. This, plus the small number of suppliers that dominate the market, has led the rest of the companies to focus on smaller, better-defined portions of ECM processes. This trend is evident in capture, for example: there are many solutions that focus only on this aspect of ECM.

The result of these characteristics

Under these conditions, companies in the market are racing to achieve:
  1. Differentiation from other providers.
  2. Making it easier for customers to adopt a content management system.
  3. Making their ECM systems integrate with other tools to offer more complete solutions.
  4. New outlets parallel or transverse to the well-established branches of ECM: document management, web content management, etc.

What these goals are leading to: Trends in ECM

Now we come to the trends. In pursuing these goals, which ensure survival in the market, some trends are consolidating. We describe the most important ones below:

Adding more content to the word "Content": This is a clear new outlet for the ECM market, which falls under goal number 4. Companies in the market are looking to manage emerging content types that have become relevant as information in the daily tasks of businesses, for example content published on social networking sites, or emails. In the latter case, although the use of email in companies is nothing new, there is growing interest in keeping it under control and, in many cases, in handling the business-critical information it contains.

SaaS, reducing the requirements for implementing ECM platforms: We can frame this trend within goal number 2. ECM providers now give their customers the opportunity to offload the whole maintenance burden to the cloud. Software as a Service lets customers have a content management system without having to make a large investment in technology. With SaaS, customers pay for the use of the ECM platform rather than for the platform itself: the provider maintains the ECM platform on its servers and the customer only pays the bill for its use, as with electricity or water bills. In 2010, according to AIIM, the rate of use of this model increased from 2% to 6%, and it is expected to continue growing.

CMIS, making different repositories compatible: Many companies, for various reasons, use multiple repositories. Until just over a year ago, this meant having to access information through multiple access points and, in many cases, added the problem of duplicate information to the problem of decentralization. It was also common for companies to run into problems when connecting information from other systems with the content platform. These reasons made it difficult to decide to buy a new ECM system, even one that offered better performance than the current one. CMIS is breaking the monogamous relationship between repositories and businesses and forcing suppliers to deliver improvements in their systems. We can say that CMIS is a response to goals 2 and 3, but at the same time it is pushing ECM companies to strive much harder to differentiate through product innovation, because the easier it is to migrate to other repositories, the harder it is to retain customers without offering anything new (ECM innovation).

Partnerships between providers covering different processes within ECM: It is clear that ECM market conditions are forcing suppliers to offer customers more choice and convenience. One of the drawbacks to overcome (a result of those same market dynamics) is precisely the need to integrate different solutions to solve ECM problems, with the clients themselves having to carry out that integration work. Providers have realized that it's much more efficient to do what they do best and partner with others than to tackle the aspects in which they are less strong. Here we can cite the integration between traditional repositories and capture solutions (Nuxeo + Ephesoft) or workflow design tools (Alfresco + Activiti, etc.).

Innovation for survival: ECM software suppliers know that their solutions must keep pace with the frantic globalized world in which their clients operate. It's not just about solving the problem of excess paper or "Content Chaos": customers need to perceive higher value and products that fit their needs. One example of the constant innovation happening in the sector is the progress in document recognition and analysis, which brings customers huge savings on manual tasks. In addition, the growing demand for mobility has forced ECM providers to develop mobile clients that allow access from remote locations and make information available without temporal or spatial barriers.

So much for this brief analysis of the ECM market. In the not too distant future we will certainly have to revise this document, see how these trends have fully consolidated and look toward the emerging branches (e.g. eDiscovery) that will change this very interesting market a little more.

At Athento (we could not finish the post without talking about our platform!) we always try to keep pace with the market and even, as we've been doing so far, to anticipate what is now being done.

  • Athento can be used in SaaS or OnPremise mode.
  • Athento implements CMIS and supplements it with other technologies that make it more interoperable.
  • Athento has a mobile client that allows access to different repositories from your Smartphone (iPhone or Android).
  • Athento can manage emails through its PopMail module, making the content of emails manageable and automatically classifiable.
  • Athento is the first document management technology platform to add semantics to the process of capturing and retrieving content.


Monday, October 3, 2011

Use example of Autotagging in Athento IDM: From 8 minutes to 8 seconds extracting important information from a Resume

This is not the first time we have told you about our labeling module. For those who don't yet know what it is about, labeling (as we call it for short) is one of the features included in Athento IDM, our intelligent document management solution for enterprise content management. Basically, this tool works together with OCR to extract the keywords of a document, which can then be used as labels (tags) that help us find documents from a tag cloud. It is a way to create quick access to documents that share a certain theme.

Is that all? Yes. Simple, right? Simple, but extremely useful. We will explain it with an example, which is the best way to understand how to take advantage of something: putting it into USE!

Take the case of a temporary employment agency such as Manpower, or any company (virtual or physical) or HR department dedicated to providing companies with qualified personnel to fill vacancies.

Such companies often receive curricula vitae (or resumes, for our American readers) on paper or as digital files. Some, especially those that are web based, make candidates fill in the data that the application needs in order to match applicants with vacancies through their qualifications. However, they still let candidates attach their own resume as a file, because they know a resume holds much more information than can be collected through inflexible web forms.

Either way, obtaining the important information remains a long, tedious, manual process, even when it is external users who carry it out.

For example, filling in the first form on the InfoJobs portal (famous in Europe) takes a user accustomed to the web an average of 2 minutes (the form only collects information about the account to be created), and the user still has at least 3 major sections to fill in (Studies, Experiences and Future Use). At the very least, the whole process will take a user 8 minutes.

Americans (who know a lot about web usability and many other topics) know that this is long enough to lose many users. LinkedIn is a wonderful example of how to reduce the time a user takes to complete a resume. LinkedIn lets users upload a resume in PDF, Microsoft Word or other formats to complete their profiles. The application extracts data from the resume and adds it to the user's profile. We will not study the effectiveness of this particular tool here; let's just say that in most cases it helps to complete a resume.

In employment agencies and Human Resources departments it is even more common for an employee to have to extract the information from resumes on paper or in digital format.

If, for example, 50 resumes are received daily by any route (email attachments, paper, files included in a created profile, etc.) and we assume that extracting the important information from a resume takes an employee as long as it would take a user on a job portal, we are talking about a little over 6.5 hours consumed daily in the process.

And when we want to find someone to fill a position? Companies that have digitized candidate data have it a little easier: their applications should offer a way to query the database and match position requirements against user skills. However, we would still have the problem that, in many cases, the most comprehensive information is found in the resumes attached to profiles as files. In companies where resumes are still handled on paper, someone has to review these documents one by one to find out whether they meet a certain requirement.

So we have two problems: obtaining information from resumes is still a manual job that takes up too much time, and getting quick, accurate access to the candidates who possess a given knowledge or skill is not an efficient process (and sometimes not even a useful one). Let us now see how someone with Athento iDM could dramatically improve both processes using the OCR and autotagging modules. Let's go step by step.

1. Obtaining and indexing the entire content of the resume
Through its OCR engine (Tesseract), Athento extracts data from files that are images (TIFF, PNG, PDF, DOC, XLS, GIF, JPEG). Extracting data from text documents (.doc, .odt, etc.), which are not images, poses no problems either and is fast. The process is almost immediate (it takes a few seconds per document) and, best of all, it is transparent to the user: all you have to do is upload a file to the repository (by scanning the resume, emailing it, adding it through WebDAV with drag and drop, etc.). We go from 8 minutes to get all the data from a resume to no more than 5 seconds. The OCR used in Athento has an average success rate in data extraction of 96%.

2. Generating Labels
Athento iDM uses its Autotagging module to search the indexed content for the most relevant words. These words become labels that group all the documents containing them. For example, in a programmer's resume the word JAVA is relevant. It is important to note that a document contains many words, such as articles and prepositions, that have no relevance. If we grouped under the same category, label or tag all the documents that contain, for example, the word "by", the group would surely contain every document in the repository, which would do us no good. This is what we call "Document Intelligence": Athento can reason about which terms are relevant within the content.
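As a rough illustration of the idea of keeping relevant words and discarding articles and prepositions, here is a minimal Python sketch. The post does not describe how Athento decides relevance, so this swaps in a very simple technique: a stop-word list plus word frequency. The `extract_tags` function, the stop-word list and the sample text are all hypothetical.

```python
import re
from collections import Counter

# A tiny, hand-picked stop-word list; a real one would be much longer.
STOP_WORDS = frozenset({
    "a", "an", "and", "as", "at", "by", "for", "from", "in", "of",
    "on", "or", "the", "to", "with",
})


def extract_tags(text: str, max_tags: int = 5, min_length: int = 2) -> list:
    """Return the most frequent non-stop-words of a text as candidate tags."""
    words = re.findall(r"[a-z][a-z+#]*", text.lower())
    counts = Counter(
        word for word in words
        if word not in STOP_WORDS and len(word) >= min_length
    )
    return [word for word, _ in counts.most_common(max_tags)]


# Hypothetical resume fragment, as it might come out of the OCR step.
resume_text = (
    "Java programmer with five years of Java experience. "
    "Built Java services and led a team of programmers by example."
)
tags = extract_tags(resume_text)
print(tags)
```

Frequency alone is a crude measure of relevance; it is used here only because it is easy to follow.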

3. Searching content by tags
Following the example of the word JAVA in a programmer's resume, clicking this tag in our tag cloud would return all the CVs of developers who have included that programming language among their skills and knowledge. Our tag cloud would surely also contain the tag "programmer", which would give us access to all the programmers' CVs with just one click. The search for candidates with particular knowledge is thus reduced to a single click on a label. As an added bonus, Athento offers a link to Wikipedia for each tag in the system, in case we want to know what a label means.
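The one-click lookup described here behaves like an index from each tag to the documents that carry it. The following Python sketch shows that data structure under stated assumptions: the function names, the file names and the tags are hypothetical, and nothing here reflects how Athento stores its tags internally.

```python
from collections import defaultdict


def build_tag_index(doc_tags: dict) -> dict:
    """Map each tag to the set of documents labeled with it."""
    index = defaultdict(set)
    for doc_id, tags in doc_tags.items():
        for tag in tags:
            index[tag.lower()].add(doc_id)
    return dict(index)


def documents_for(index: dict, tag: str) -> list:
    """What a click on a tag would return: every document with that tag."""
    return sorted(index.get(tag.lower(), ()))


def tag_cloud(index: dict) -> list:
    """Tags with their document counts, most used first."""
    return sorted(
        ((tag, len(docs)) for tag, docs in index.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Hypothetical resumes and the tags generated for them.
doc_tags = {
    "cv_ana.pdf": ["java", "programmer"],
    "cv_luis.pdf": ["programmer", "python"],
    "cv_marta.pdf": ["accounting"],
}

index = build_tag_index(doc_tags)
print(documents_for(index, "JAVA"))
print(tag_cloud(index))
```

Since the index is built once when documents are tagged, a lookup does not need to re-read any resume, which is what makes the search feel immediate.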

With this example we've seen how Athento iDM reduces the extraction of the information contained in a resume from 8 minutes to 5 seconds, and how it turns the search process, whether manual or through a form (as applications commonly offer), into a single click. We hope the example has proved illuminating.
