|
OCR (optical character recognition) is the recognition of printed or written text characters by a computer. It involves photoscanning the text character by character, analyzing the scanned-in image, and then translating each character image into a character code, such as ASCII, commonly used in data processing. During OCR processing, the scanned-in image or bitmap is analyzed for light and dark areas in order to identify each alphabetic letter or numeric digit. When a character is recognized, it is converted into an ASCII code. Special circuit boards and computer chips designed expressly for OCR are used to speed up the recognition process.

Libraries use OCR to digitize and preserve their holdings. OCR is also used to process checks and credit card slips and to sort the mail. Large volumes of magazines and letters are sorted every day by OCR machines, considerably speeding up mail delivery.

MEAN : OCR stands for Optical Character Recognition, but in practice it is the process of taking an image and converting it into text so that it can be edited and searched. Most scanners now come with OCR software that lets you scan a document straight to text, Word, or PDF, so the document can be edited or searched without anyone having to retype it. There is also software that takes an existing image file, such as a JPEG, GIF, or TIFF, and converts it to text as well.

PAPER SCANNING :

DEFINE : Space is hard to come by in an office. Every corner of your desk winds up with a purpose of its own, whether it is a place for your pens or a spot for your coffee mug, and the desk can still end up seriously cluttered. There are notes and inter-office memos to read and act on, and when you are done you sometimes forget to throw them away, or you need to keep them for a few weeks or longer. Scanning converts paper documents into digital image format.
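Once a page exists as a scanned bitmap, the recognition step described for OCR above (finding light and dark areas, identifying the character, emitting its ASCII code) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the 3x5 glyph templates, the threshold of 128, and the function names `binarize` and `recognize` are hypothetical, and real OCR engines use far more sophisticated methods than this pixel-by-pixel template comparison.

```python
# Toy illustration only: hypothetical 3x5 templates, not a real OCR engine.
TEMPLATES = {
    "0": ["###", "#.#", "#.#", "#.#", "###"],
    "1": [".#.", "##.", ".#.", ".#.", "###"],
    "7": ["###", "..#", ".#.", ".#.", ".#."],
}


def binarize(gray, threshold=128):
    """Classify each pixel as dark (1) or light (0).

    Assumes grayscale values from 0 (black) to 255 (white).
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def template_bits(rows):
    """Convert a '#'/'.' template into the same 1/0 grid format."""
    return [[1 if ch == "#" else 0 for ch in row] for row in rows]


def recognize(bitmap):
    """Return the template character that differs from the bitmap in the fewest pixels."""
    def distance(char):
        tmpl = template_bits(TEMPLATES[char])
        return sum(
            a != b
            for bitmap_row, tmpl_row in zip(bitmap, tmpl)
            for a, b in zip(bitmap_row, tmpl_row)
        )
    return min(TEMPLATES, key=distance)


# A made-up grayscale scan of the digit 7 with one smudged pixel (the 100).
scan = [
    [20, 35, 10],
    [240, 100, 30],
    [230, 40, 245],
    [250, 25, 235],
    [245, 30, 250],
]

char = recognize(binarize(scan))
print(char, ord(char))  # prints: 7 55
```

The smudged pixel does not change the result because the match is chosen by smallest pixel difference rather than by exact equality, and `ord` supplies the ASCII code (55) that the recognized character is converted into.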
The most widely used digital file format for image storage is the Tagged Image File Format (TIFF), which can store black-and-white, grayscale, and color images. Typically, office documents are scanned at a resolution of 200 or 300 dpi (dots per inch). At 200 dpi, a black-and-white letter-size page (1 bit per pixel) requires about 500K of memory or disk space uncompressed, but only about 20-30K when compressed using CCITT Group 4 (G4), which is standard in the document management industry. Optix always compresses scanned images and passes only the compressed versions between the workstation and the server, which keeps network traffic lower. |
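The storage figures quoted above can be checked with a little arithmetic. The sketch below assumes a letter-size page of 8.5 x 11 inches, 1 bit per pixel for black-and-white scans, and "K" taken as 1,000 bytes; it ignores TIFF headers and per-row padding, and the function name `uncompressed_bytes` is hypothetical.

```python
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel=1):
    """Approximate raw image size in bytes (no headers, no row padding)."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# Letter-size page, black and white (1 bit per pixel).
size_200 = uncompressed_bytes(8.5, 11, 200)   # 1700 x 2200 pixels
size_300 = uncompressed_bytes(8.5, 11, 300)   # 2550 x 3300 pixels

print(size_200)  # 467500, i.e. roughly the 500K quoted in the text
print(size_300)  # 1051875, a little over 1 MB

# Compression ratios implied by the 20-30K range quoted for CCITT G4.
print(round(size_200 / 30_000, 1))  # 15.6
print(round(size_200 / 20_000, 1))  # 23.4
```

On these assumptions, the quoted G4 figures correspond to a reduction of roughly 16:1 to 23:1 for a 200 dpi page; actual ratios depend heavily on page content.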









