Wednesday, November 27, 2013

Improvements in automatic extraction of barcodes and other codes

Athento continues to evolve, day by day, with new improvements and new functionalities.

This time, we’ve incorporated automatic extraction of bar codes, QR codes, and many other types of codes commonly found in documents.

This way, when you upload a new document to Athento, all of the codes contained in the document will be identified and the metadata extracted (“Codes” section). This identification happens without having to tell Athento where the codes appear. The system simply searches for any code present within the document.

Automatic extraction of codes

In previous versions of Athento, when we wanted to extract the information encoded in a bar code, we first had to define where the code appeared in the associated model.

Defining the extraction template during the creation of the model

Nonetheless, in this case, automatic extraction conducts the process without needing any models. In other words, any time there’s a code in a document, it’ll get extracted.

However, if we want the information in this code to become a piece of metadata that we can export, we do need to define it in the model. The good thing about having both approaches is that if a model has already been associated with the uploaded document, the data in the barcodes, QR codes and other types of codes will be extracted by both methods, allowing the user to compare and contrast the information.
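To picture what "compare and contrast" could look like, here is a minimal sketch in Python. It is not Athento's code: the function name, the metadata name and the sample values are all made up for illustration, and we simply assume that the automatic method returns a list of code values while the model returns named fields.

```python
def compare_extractions(automatic, from_model):
    """Compare codes found automatically with values extracted via the model.

    automatic:  list of code values found anywhere in the document
    from_model: dict mapping a metadata name to the value read from the
                zone defined in the model
    (Both shapes are assumptions made for this sketch.)
    """
    auto_values = set(automatic)
    report = {}
    for name, value in from_model.items():
        report[name] = "match" if value in auto_values else "mismatch"
    # Codes found automatically that no model field accounts for
    unmapped = sorted(auto_values - set(from_model.values()))
    return report, unmapped


report, unmapped = compare_extractions(
    automatic=["INV-2013-0042", "https://example.com/doc/42"],
    from_model={"invoice_barcode": "INV-2013-0042"},
)
print(report)    # {'invoice_barcode': 'match'}
print(unmapped)  # ['https://example.com/doc/42']
```

A mismatch would suggest that the zone defined in the model needs checking, while unmapped codes are candidates for new metadata in the model.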

Automatic extraction + Extraction using the model

To try the new functionality, try our online demo!
In the Athento User Guide, you’ll find more information about automatic data extraction using codes.


Monday, November 25, 2013

Coding in ECM: in danger of dying out?

As I was surfing the web the other day, I came across a really interesting post written by Susanth Kurunthil, a consultant working in Enterprise Content Management (ECM), who has fifteen years of experience working with technologies like Filenet, Alfresco, SharePoint, Kofax and Captiva. The title of the post was pretty striking:

“ECM – don’t need developers anymore.”

The title compelled me to read the post. How could this guy SAY that? That’s the first thing I thought. Kurunthil’s position is that the ECM market is heading towards a scenario where developers aren’t wanted on staff – not because of having an ECM solution that covers the business’s needs, but because what companies want are flexible, powerful applications that can be configured (and not by writing code): that’s the way the tool should cover the business’s needs.

Truth is, this trend isn’t limited to ECM. It’s called specialization, and it means each business dedicating itself to its core business: the areas in which it excels. After all, if we’re a bank, why do we need five developers working on an ECM application that we can purchase, with fewer complications?

I think that that’s the point Kurunthil is trying to make, although he goes further than that. Kurunthil is saying that this type of “team of developers” is becoming an extinct species. And not only do the developers know it, they react to the threat, as a result.

To provide an example, Kurunthil tells the story of a large petroleum company in the Middle East. The company had its own team of developers building its ECM solution on top of a well-known ECM platform. What the team did was to fill the original solution with a load of patches; and, naturally, it didn’t take long for the other users in the company to realize that the platform was a nightmare to use. Users had to carry out too much manual work, the application was always failing and it became a pain in the neck for the rest of the staff.

The IT director at the time brought in a project manager who had a ton of experience and who could make everyone’s lives a lot easier. True to form, the project manager was a wizard and, within a year, had created a complete document management system which would solve the petroleum company’s headaches. Not only that: the project manager had a five-year road map for the ECM completely planned!

And they all lived happily ever after…everyone but, of course, the team of developers, who saw their heads on the chopping block. They didn’t just stand there, twiddling their thumbs: they convinced the IT director to axe the project manager, saying that it wasn’t necessary to have that many people, and they could do it all themselves. And that, my friends, is how a brilliant framework planned and implemented by the project manager, which made the users in the company happy, ended up in the dump.

Personally, I don’t believe that all stories have to have endings that are that sad, or that anywhere there are developers, things work that badly. What I do agree with is that, more and more, businesses are demanding applications that don’t involve long, tortuous implementations. They’re demanding the exact opposite: out of the box applications that work well.

What do you folks think?


Thursday, November 21, 2013

Enterprise Content Management and the Management of Content Life Cycle (Part 1)

Today, we’re going to get into a conceptual debate and explain a bit more about the main concepts on which the ECM universe is founded. 

The first fundamental concept to understand is that of the content life cycle. Documents and other content/digital assets have a life cycle within organizations. This life cycle begins when new documents are created or received, and ends when they are finally destroyed or permanently stored. A life cycle tends to be defined as the different states of publication of a document’s content, although it’s really a deeper concept that also involves the way in which users interact with the content (if it’s only retained for legal requirements, if it’s being consulted on a regular basis, or, in contrast, if the document is still active, etc.) 

Proper management of content over all the phases guarantees proper storage of the business’s information, and the capabilities of companies to exploit this information. 
The discipline that works with this life cycle is called Enterprise Content Management (ECM), a discipline which takes in many other disciplines that, traditionally, have been chosen separately for managing digital business content in each one of its phases. These phases are known as Capture, Manage, Store, Preserve and Deliver: 

*Poster ECM-101 / AIIM - Bryant Duhon

Capture: Addresses how documents get into the information system. You can call the capture “smart” when information is obtained automatically from the documents going into the system.  

Manage: Means the movement and the circulation of the documents, as well as being able to use the information contained in them.

Store: This phase refers to where the documents or digital content are stored, and our ability to get them back. 

Preserve: Conserving and keeping the digital content over the long term. 

Deliver: This phase works with the integration of the ECM system with the other business applications or the business’s entire information system. These are the available mechanisms to get the content to the people who need it, through the appropriate channels. 

In the next post, we’ll see how each one of these disciplines uses different software tools, as well as the use of an endless number of different technologies. 


Tuesday, November 19, 2013

Changes in the Capture Life Cycle in Athento

In document management, we typically use the term “life cycle” of a document to talk about those states that a document can go through, from the moment it’s generated right up until it’s been totally processed.

Up until now, in our Capture module, there were two possible states in the life cycle of a document: “Reviewing” and “Validated”. Now, you can also put a document into a “Recorded” state.

Let’s take a brief look at what each of those states means:

Reviewing: This state shows that the document has been uploaded to the system, it has been processed and its metadata have been extracted, but it still hasn’t been manually reviewed by a person.

Validated: A person has accessed the document, has confirmed that the extracted data are correct, and has validated it manually (the “Validate Document” button).

Recorded: This third state represents something more abstract. It’s the state in which a document, after being validated in Athento, has moved on to another process.

Depending on the needs of each business or organization, the “Recorded” state could mean a number of things: 

  • Archive the paper documents once they’ve been validated in Athento
  • Destroy the paper documents
  • Carry out other operation(s) with documents validated in Athento 
  • Carry out other operation(s) with documents validated in Athento, from another application, using Athento’s web services
  • Other operations 

Athento’s web services allow you to perform any operation with your documents, as we’ve mentioned in one of the previous points. However, they also allow us to process the states of life cycles of documents in such a way that the entire process, after the validation, can be done from any application that we connect to Athento.

Aside from the logical advantage of using life cycles in document management, faceted search in Athento lets us find all of our documents that are Reviewing, that have been Validated, or that have been Recorded, depending on what we need to search for at any given moment.
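The three states can be pictured as a small state machine. The Python sketch below is only an illustration: the class and function names are ours, not Athento's API, and we assume that the states are visited strictly in order (Reviewing, then Validated, then Recorded). The last helper is a rough stand-in for filtering by the life cycle facet.

```python
from enum import Enum


class State(Enum):
    REVIEWING = "Reviewing"
    VALIDATED = "Validated"
    RECORDED = "Recorded"


# Allowed transitions (assumption: the life cycle only moves forward)
TRANSITIONS = {
    State.REVIEWING: {State.VALIDATED},
    State.VALIDATED: {State.RECORDED},
    State.RECORDED: set(),
}


class Document:
    def __init__(self, name):
        self.name = name
        self.state = State.REVIEWING  # state right after upload and extraction

    def move_to(self, new_state):
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"{self.state.value} -> {new_state.value} is not allowed")
        self.state = new_state


def by_state(documents, state):
    """Rough stand-in for a faceted search on the life cycle state."""
    return [d.name for d in documents if d.state is state]


docs = [Document("invoice.pdf"), Document("contract.pdf")]
docs[0].move_to(State.VALIDATED)
docs[0].move_to(State.RECORDED)
print(by_state(docs, State.RECORDED))   # ['invoice.pdf']
print(by_state(docs, State.REVIEWING))  # ['contract.pdf']
```

In a real deployment, the move to “Recorded” would be triggered by whatever follow-up process the organization has defined, for example from another application through the web services.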

You can see an example in this video:

New "Recorded" state of the cycle of life of documents in Athento from Athento on Vimeo.

Monday, November 18, 2013

Recognizing handwritten letters: the state of the art with ICR technologies

Today, we’re going to talk a bit about one type of document capture technology: Intelligent Character Recognition (ICR). ICR software has been created to convert handwritten letters into text that can be recognized and read by computers.

This technology isn’t as advanced as OCR technology: it has a lot of problems and the results aren’t as accurate. Reading handwritten words is difficult even for humans: when was the last time you tried to make out the content of a doctor’s prescription?

Nonetheless, this technology looks like it’s moving forward. Some companies such as Parascript are talking about ICR systems being able to read up to 95% of handwritten texts, with an error rate of about 2%. 

But how have solutions for reading handwritten characters in documents and converting them into digital information developed over time?

  1. The first solution was to have people read the information on paper and then key it into the computer themselves. This system is still being used in government organizations, hospitals, banks, educational institutions, etc. It goes without saying that because this solution is manual, it’s also expensive; but it solves an even greater problem for businesses: recovering information.
  2. Next, we started using boxes in documents to force people to write in a specific space, as you can see in the example below:
    This was because the ICR technology that was available at the time couldn’t recognize the characters if they were touching each other. 
  3. After that, the idea of printing documents in "drop-out ink" (pastel colors, most of which are invisible to OCR) came along. With this, it became possible to make the ICR read only the handwritten characters, without added noise. According to Imerge Consulting, this solution alone could eliminate 60% of the workers dedicated to data entry. Up to this point, ICR systems worked by identifying letters one by one (box by box), but what the industry is looking for is a “usability criterion” at the field level (put another way, that the complete word or sentence is correct and makes sense).
  4. Truth is, we don’t live in a world where people always write inside the lines. That’s why ICR technologies have been reinforced over the past few years for freely-written text. Since the letters aren’t constricted within boxes, we face an endless number of additional problems: for example, what happens when the width of the letters varies, or the letters touch or overlap, etc. New algorithms are currently being used, some of which compare the handwritten characters against an immense database of images, analyzing the parts, linguistic patterns, etc. Even with all of this, results still aren’t as good as those obtained by using OCR. Those defending the use of ICR argue that whatever accuracy rate is reached still translates into reduced labor costs. What’s certain is that some problems, like reading cursive writing, still don’t have a solution.
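To make the idea of a field-level criterion more concrete, here is a minimal sketch in Python. It is not how any particular ICR product works: the per-character confidence scores, the 0.80 threshold and the small list of valid words are all assumptions made for the example.

```python
def accept_field(char_results, lexicon, min_confidence=0.80):
    """Decide at field level whether a recognized word can be trusted.

    char_results: list of (character, confidence) pairs, one per box
    lexicon:      set of lowercase words that are valid for this field
    (The threshold and the data shapes are illustrative assumptions.)
    """
    if not char_results:
        return "", False
    word = "".join(ch for ch, _ in char_results)
    weakest = min(conf for _, conf in char_results)
    # The field passes only if every character is confident enough
    # AND the assembled word makes sense for this field.
    return word, (weakest >= min_confidence and word.lower() in lexicon)


cities = {"madrid", "malaga", "sevilla"}
good = [("M", 0.97), ("a", 0.91), ("d", 0.88), ("r", 0.93), ("i", 0.95), ("d", 0.90)]
bad = [("M", 0.97), ("a", 0.91), ("d", 0.88), ("n", 0.55), ("i", 0.95), ("d", 0.90)]
print(accept_field(good, cities))  # ('Madrid', True)
print(accept_field(bad, cities))   # ('Madnid', False)
```

Fields that fail the check would be the ones sent to a person for manual review, which is where the savings in data entry come from.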


Friday, November 15, 2013

Which operating systems can you run Athento on? [FAQs]

Our friend Alberto Lara has asked us a question: specifically, he’d like to know one thing about Athento: Can it be used on Windows and Linux?

Athento is an application which has been developed in Java. As many of you know, Java is a multi-platform programming language, which means that its ability to operate isn’t tied to one specific operating system.

When you download Athento from our web page, you can choose between Linux and Windows, which are currently the most widely used operating systems around the world for development environments. Choose the installation file that matches the operating system you work with. The difference between them is that the libraries for handling image files depend on the operating system, so each installation file includes the library for its own operating system.

At a higher level, as users, you shouldn’t experience any difference between working with Athento in Windows or in Linux-based systems (like Debian or Red Hat).

Alberto, I hope that that’s answered your question.

I’d like to remind you of some links of interest around this topic:

Downloading the application
Installing the application in Linux
Installing the application in Windows


Tuesday, November 12, 2013

Automatic Relations between documents in Athento

To make it quicker to access documentation, it’s often fundamental to be able to access one document from inside another document that it has a direct relationship with. In most document managers, relations are built manually. Put simply, a user tells the system which document is related to which other document(s). In Athento, these relationships are built automatically when different documents contain the same piece of metadata with the same value.

For example, let’s suppose that a school or educational center stores the documents for each new student: the student card and the national identity document. For both types of documents, the master metadata type called “national identity number” is created in order to be extracted. Once Athento carries out data extraction on those documents, it relates any documents that share the same value for the “national identity number” metadata. The value lies in the system being able to associate the documents which belong to the same student’s file.
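The matching rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Athento's implementation: the function name, the file names and the metadata key are invented for the example.

```python
from collections import defaultdict


def build_relations(documents, key):
    """Relate documents that share the same value for one metadata key.

    documents: dict mapping a document name to its extracted metadata (a dict)
    Returns a dict mapping each document to the sorted list of related documents.
    """
    by_value = defaultdict(list)
    for name, metadata in documents.items():
        value = metadata.get(key)
        if value:  # documents without the metadata are never related
            by_value[value].append(name)

    relations = {name: [] for name in documents}
    for names in by_value.values():
        for name in names:
            relations[name] = sorted(n for n in names if n != name)
    return relations


docs = {
    "student_card_ana.pdf": {"national_identity_number": "12345678Z"},
    "id_document_ana.pdf": {"national_identity_number": "12345678Z"},
    "student_card_luis.pdf": {"national_identity_number": "87654321X"},
}
relations = build_relations(docs, "national_identity_number")
print(relations["student_card_ana.pdf"])   # ['id_document_ana.pdf']
print(relations["student_card_luis.pdf"])  # []
```

Grouping by value first means each document is visited once, rather than comparing every document against every other one.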

Below, we’re going to see the example of various contracts which share the piece of metadata annotation, and how, from the “Relations” tab in Athento, you can see the documents in which the value of this piece of metadata coincides.

Documents automatically related in Athento from Athento on Vimeo.

Friday, November 8, 2013

The difference between version control and versioning

A couple of weeks ago, I received an interesting e-mail from the Real Story Group. The e-mail talked about the differences between these two terms, and even though they’re closely related, they mean two different things. Although the material from RSG deals more with Web Content Management applications (such as Drupal or Joomla), these concepts are also applicable to document management.

Version Control
For the people at RSG, “version control” means a collection of functionalities built on the assumption that the people who are working on a specific piece of content will get in each other’s way. For example, when two people are working on a document at the same time, they keep writing over each other’s content, so that one person’s work gets lost. Document managers and ECM (Enterprise Content Management) systems, just like WCMs, implement check-in and check-out functionalities: what they do is lock the content or the documents while a user is working on them. In document management and ECM, however, version control goes a little further and allows for the use of a revision history showing the work performed on a document, its different versions, when previous versions were recovered, etc.
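A minimal sketch of the check-out/check-in idea, in Python. The class and method names are hypothetical and don't correspond to any specific ECM or WCM product; the point is simply that a second user can't take the document while the first one holds it.

```python
class CheckoutError(Exception):
    pass


class LockableDocument:
    def __init__(self, content=""):
        self.content = content
        self.locked_by = None  # user currently holding the check-out, if any

    def check_out(self, user):
        if self.locked_by not in (None, user):
            raise CheckoutError(f"already checked out by {self.locked_by}")
        self.locked_by = user

    def check_in(self, user, new_content):
        if self.locked_by != user:
            raise CheckoutError(f"{user} has not checked this document out")
        self.content = new_content
        self.locked_by = None  # release the lock for other users


doc = LockableDocument("draft")
doc.check_out("ana")
try:
    doc.check_out("luis")
except CheckoutError as err:
    print(err)  # already checked out by ana
doc.check_in("ana", "draft, revised by Ana")
doc.check_out("luis")  # succeeds now that the lock has been released
```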

Versioning
Versioning is the ability that an ECM, WCM or document manager gives to store and save different versions of the same content or document. The goal of this capability is to let us recover previous versions if we want them, for example, when we’ve made a mistake. Remember that the version of a document or digital content is a variation of a digital asset or its metadata: in other words, a new version means having an update, edit or change with respect to a previous version (or its metadata).
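Versioning can be sketched as a simple history of snapshots. The Python below is an illustration under our own assumptions (hypothetical class and method names, versions numbered from 1, and restoring an old version by saving it again as the newest one); real systems may handle this differently.

```python
class VersionedDocument:
    def __init__(self, content, metadata=None):
        # Each version is a snapshot of the content plus its metadata
        self._versions = [(content, dict(metadata or {}))]

    @property
    def current(self):
        return self._versions[-1]

    def save(self, content=None, metadata=None):
        """Store a new version; a change to content OR metadata counts."""
        old_content, old_metadata = self.current
        new = (old_content if content is None else content,
               dict(old_metadata if metadata is None else metadata))
        self._versions.append(new)
        return len(self._versions)  # version number, starting at 1

    def restore(self, number):
        """Recover an earlier version by saving it again as the newest one."""
        content, metadata = self._versions[number - 1]
        return self.save(content, metadata)


doc = VersionedDocument("Contract v1", {"status": "draft"})
doc.save(content="Contract with a mistake")
doc.save(metadata={"status": "final"})
doc.restore(1)
print(doc.current)  # ('Contract v1', {'status': 'draft'})
```

Restoring by appending keeps the full history intact, so the mistaken version can still be inspected later.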

Normally, in document management, both the capabilities for versioning, as well as for version control, tend to be grouped within the terminology of “version control”. 

Thursday, November 7, 2013

Managing employee contract files with Athento

One of the areas in which businesses most need an efficient, powerful document manager is human resources management. Activities surrounding personnel management, and, above all, anything related to hiring, tend to be document-intensive. What’s more, for large businesses with satellite offices but a centralized HR function (or for professional administrators with different clients), signing contracts, to give an example, can turn into an inefficient process. If we’ve got someone working in Málaga, we’ve got to send each worker’s file down from Madrid, sign it in Málaga and then send the contracts back to Madrid.

How to manage this process more efficiently:
The answer is simple: this process should be managed in digital format:

  1. Employees bring in documents: New employees need to bring in documents like national identity cards, social security registration numbers, etc. It should be possible to do this from various sources, such as by e-mail, from a website, on paper to be digitized, from third-party applications, etc. What we need, then, is to strengthen our information systems’ document capture capabilities.
  2. Open a file for the new hire: To make sure that we’ve got everything for the employee’s work history, we should create a new file in our document manager, and this file will be in charge of storing all documentation provided by the worker. This file should be saved with all document management guarantees for a minimum of five years.
  3. Make the file accessible to the professional administrator or the client (depending on the case): In Spain, there are a lot of cases in businesses where everything concerning hiring a new employee is carried out by a professional administrator (who has his own business separate from the client’s business). These professional administrators need to consult the worker’s documentation. Or, in the case that we work for one of these professional managers, we need the client who employs the worker to have access to this documentation. Going back to the first scenario, any documents produced by the administrator will also have to be sent to the file.  
  4. Send the contracts ready to be signed to the person in charge: The person in charge of hiring should be notified electronically each time there are new contracts to be signed. Additionally, this person should be able to see, at any time, which contracts have yet to be signed.
  5. A digital signature from the employee: Digital signatures can be done using mobile devices.

What do we gain from this method of managing hiring processes?

  • Speed in carrying out the process
  • Reduced risk of losing associated documents
  • Reduced paper consumption
  • Guarantees that signed contracts are always up-to-date
  • Makes it much easier to recover information
  • Helps make it possible for all the people involved in the process to have the documentation that they need, at that moment, and without delays. 


Monday, November 4, 2013

Within the same tool (Athento), is it possible to configure and orchestrate workflows and not just integrate them with already-existing work flows? [FAQs]

This question comes to us from Francisco Nazario Santiago, who works in Mexico.

Managing review/approval workflows
Within Athento’s capture module, workflows aren’t managed as such: that’s done in the ECM module. The capture module is the entry point for documents into the document management system (whether via capture of e-mails, monitoring folders, uploading documents from the platform itself, capture from Dropbox, capture from an ECM system, etc.), and it obtains relevant information from them. In other words, once documents have been captured and classified and their data has been obtained, they can be sent to an electronic document management system, which could be Athento’s very own ECM module or any other repository (SharePoint, Alfresco, Nuxeo, OpenText, etc.).

From the ECM system, workflows can be configured. In the case of Athento ECM, it’s possible to configure revision workflows or approval workflows from the platform’s own interface.

Workflows can be parallel (put another way, various people can review the document at the same time without needing to consider the order in which they do it) or serial (sequential reviews in which one person can’t review a document until another person has reviewed it).
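The difference between the two kinds of review comes down to who is allowed to act at a given moment. Here is a small Python sketch of that rule; the function name and the mode labels are our own, not part of Athento ECM's configuration.

```python
def pending_reviewers(reviewers, done, mode):
    """Return who may review the document right now.

    reviewers: ordered list of reviewer names
    done:      set of reviewers who have already reviewed
    mode:      "parallel" or "serial" (labels assumed for this sketch)
    """
    remaining = [r for r in reviewers if r not in done]
    if mode == "parallel":
        return remaining        # everyone still pending can act, in any order
    if mode == "serial":
        return remaining[:1]    # only the next person in line can act
    raise ValueError(f"unknown mode: {mode}")


team = ["ana", "luis", "marta"]
print(pending_reviewers(team, {"ana"}, "parallel"))  # ['luis', 'marta']
print(pending_reviewers(team, {"ana"}, "serial"))    # ['luis']
print(pending_reviewers(team, set(), "serial"))      # ['ana']
```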

As a default, without any kind of parameterization and without additional costs, users of Athento ECM can use this functionality.

Business workflows
These work flows are adapted to the specific needs of your business. Normally, they are complex work flows, which include multiple decision paths and diverse people involved in the process.

These workflows require modeling of the flow and, afterwards, being included in the ECM tool. They have to be studied, analyzed and put into action. Athento’s applications are built on a development framework that considers the design and implementation of these workflows. Specifically, at its heart, Athento allows for the use of a high-level service called "Athento Workflow". This service lets users work with workflows in JBPM5 and Drools, in which they can define tasks, activities, rules and phases of workflows that can be completely customized. For more information about what users can do with Athento, you can read the post that talks about integration with Drools. It’s also possible to work with workflows using Athento and Bonita.

Although Athento ECM already comes prepared to manage work with complex workflows, these types of workflows require some parameterizing and adjustments to reflect the client’s particular situation. That’s why they don’t come included in the default version, and require separate billable hours of development.