public abstract class OCREngine
extends java.lang.Object
Direct Known Subclasses: TesseractOCREngine, GOCREngine
Modifier and Type | Field and Description |
---|---|
float | defaultZoomScale |
static java.lang.String | OCR_DEFAULT_ENGINE_CLAZZ |
static java.lang.String | OCR_DEFAULT_ENGINE_KEY |
static java.lang.String | OCR_G_ENGINE_CLAZZ |
static java.lang.String | OCR_G_ENGINE_KEY |
static java.lang.String | OCR_T_ENGINE_CLAZZ |
static java.lang.String | OCR_T_ENGINE_KEY |
static java.lang.String | STAF_OCR_ENGINE_VAR_NAME |
static java.lang.String | STAF_OCR_LANGUAGE_ID_VAR_NAME |
Constructor and Description |
---|
OCREngine() |
Modifier and Type | Method and Description |
---|---|
java.awt.Rectangle | findTextRectFromImage(java.lang.String searchtext, int index, java.awt.image.BufferedImage image, java.lang.String stdlangId, java.awt.Rectangle subarea, float zoom) |
float | getdefaultZoomScale() |
static OCREngine | getOCREngine(java.lang.String ocrNameKey, STAFHelper staf) Note: This method first tries to get the OCR engine corresponding to ocrNameKey. If none is found, it tries the OCR engine defined by the STAF variable STAF_OCR_ENGINE_VAR_NAME. If no OCR engine is defined in that STAF variable, it uses the default engine defined by OCR_DEFAULT_ENGINE_KEY. |
static java.lang.String | getOCREngineKey(STAFHelper staf) Note: This method gets the value of the STAF variable STAF_OCR_ENGINE_VAR_NAME. |
static java.lang.String | getOCRLanguageCode(STAFHelper staf) Note: This method gets the value of the STAF variable STAF_OCR_LANGUAGE_ID_VAR_NAME. |
protected java.lang.String | getSelfDefinedLangId(java.lang.String langId) Translate the standard language code to the OCR-specific language code. |
java.lang.String | imageToText(java.awt.image.BufferedImage image, java.lang.String langId, java.awt.Rectangle subarea) |
java.lang.String | imageToText(java.awt.image.BufferedImage image, java.lang.String langId, java.awt.Rectangle subarea, float zoom) Convert a buffered image to text using OCR technology. |
static void | main(java.lang.String[] args) Usage: java OCREngine imageFile [-e engine] [-l languageID] [-z scale] |
protected java.lang.String | runCommandLine(java.lang.String cmdline) |
void | setdefaultZoomScale(float value) |
static boolean | setOCREngineKey(STAFHelper staf, java.lang.String engineKey) Note: This method sets the value of the STAF variable STAF_OCR_ENGINE_VAR_NAME. |
static boolean | setOCRLanguageCode(STAFHelper staf, java.lang.String languageCode) Note: This method sets the value of the STAF variable STAF_OCR_LANGUAGE_ID_VAR_NAME. |
java.lang.String | storedImageToText(java.lang.String imagefile, java.lang.String langId, java.awt.Rectangle subarea) |
java.lang.String | storedImageToText(java.lang.String imagefile, java.lang.String langId, java.awt.Rectangle subarea, float zoom) Convert an image file to text using OCR technology. |
java.awt.image.BufferedImage | zoomImageWithType(java.awt.image.BufferedImage image, int imageType, java.awt.Rectangle subarea, float zoomValue) |
public static java.lang.String STAF_OCR_ENGINE_VAR_NAME
public static java.lang.String STAF_OCR_LANGUAGE_ID_VAR_NAME
public static java.lang.String OCR_T_ENGINE_KEY
public static java.lang.String OCR_G_ENGINE_KEY
public static java.lang.String OCR_T_ENGINE_CLAZZ
public static java.lang.String OCR_G_ENGINE_CLAZZ
public static java.lang.String OCR_DEFAULT_ENGINE_KEY
public static java.lang.String OCR_DEFAULT_ENGINE_CLAZZ
public float defaultZoomScale
public float getdefaultZoomScale()
public void setdefaultZoomScale(float value)
public java.lang.String imageToText(java.awt.image.BufferedImage image, java.lang.String langId, java.awt.Rectangle subarea) throws SAFSException
Throws:
SAFSException
public java.lang.String imageToText(java.awt.image.BufferedImage image, java.lang.String langId, java.awt.Rectangle subarea, float zoom) throws SAFSException
Parameters:
image - the input BufferedImage to convert to text, expected to be an image as displayed on a computer screen. Such screen-captured images are at low DPI (75) and need to be resized before being used by the OCR engine. Tesseract uses images at 300 DPI.
langId - language ID representing the language that the OCR engine should recognize.
subarea - the area of the input image to convert. NULL stands for the whole area of the image.
zoom - float, an optional parameter giving the zoom value.
SAFS takes a simple approach: it resizes the image to fit the 300 DPI that OCR requires for better text recognition.
Normal screen-captured images are at low DPI (75~90).
A value between 0 and 1 is the factor for zooming out the source image to fit 300 DPI for text recognition.
A value greater than 1 is the factor for zooming in the source image to fit GOCR's requirement.
The user may supply a zoom value suited to the text in an image so that it can be recognized.
Throws:
SAFSException - if any Exception is encountered
public java.lang.String storedImageToText(java.lang.String imagefile, java.lang.String langId, java.awt.Rectangle subarea) throws SAFSException
Throws:
SAFSException
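The zoom parameter described above for imageToText relates a low-DPI screen capture to the 300 DPI that the OCR engine prefers. The following is a minimal sketch of that arithmetic, assuming the zoom is a plain multiplicative scale factor on pixel dimensions. The class `DpiZoomSketch` and its methods are hypothetical and are not part of SAFS; the source does not state how OCREngine derives its default zoom scale.

```java
import java.awt.Dimension;

/** Hypothetical helper, not part of SAFS: derives a zoom factor from DPI values. */
final class DpiZoomSketch {
    private DpiZoomSketch() {}

    /** Scale factor that maps sourceDpi onto targetDpi (assumed to be a plain ratio). */
    static float zoomForDpi(float sourceDpi, float targetDpi) {
        if (sourceDpi <= 0f || targetDpi <= 0f) {
            throw new IllegalArgumentException("DPI values must be positive");
        }
        return targetDpi / sourceDpi;
    }

    /** Pixel size of an image after applying the zoom factor, never smaller than 1x1. */
    static Dimension scaledSize(int width, int height, float zoom) {
        int w = Math.max(1, Math.round(width * zoom));
        int h = Math.max(1, Math.round(height * zoom));
        return new Dimension(w, h);
    }
}
```

Under these assumptions, `DpiZoomSketch.zoomForDpi(75f, 300f)` gives a factor of 4.0, so a 200x50 pixel capture would be resized to 800x200 pixels before recognition.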
protected java.lang.String getSelfDefinedLangId(java.lang.String langId)
Parameters:
langId - standard language code
public java.awt.Rectangle findTextRectFromImage(java.lang.String searchtext, int index, java.awt.image.BufferedImage image, java.lang.String stdlangId, java.awt.Rectangle subarea, float zoom) throws SAFSException
Throws:
SAFSException
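getSelfDefinedLangId translates a standard language code into the code a specific OCR engine expects. A subclass might implement that translation roughly as sketched below. The class `LangIdSketch`, the choice of two-letter codes as the "standard" form, and the pass-through behavior for unknown codes are all assumptions made for illustration; only the three-letter values (`eng`, `fra`, `deu`) correspond to language names that Tesseract itself uses.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical translation table, not the SAFS implementation. */
final class LangIdSketch {
    // Assumed standard codes (two-letter) mapped to Tesseract-style codes.
    private static final Map<String, String> TO_ENGINE_CODE = new HashMap<>();
    static {
        TO_ENGINE_CODE.put("en", "eng");
        TO_ENGINE_CODE.put("fr", "fra");
        TO_ENGINE_CODE.put("de", "deu");
    }

    private LangIdSketch() {}

    /** Returns the engine-specific code, or the input unchanged if no mapping is known. */
    static String toEngineLangId(String langId) {
        if (langId == null) {
            return null; // mirrors "NULL means the engine does not care about language"
        }
        String key = langId.trim().toLowerCase(Locale.ROOT);
        String mapped = TO_ENGINE_CODE.get(key);
        return mapped != null ? mapped : langId;
    }
}
```

Returning the input unchanged for unmapped codes is one possible design choice; an implementation could equally fall back to a default language or throw an exception.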
public java.lang.String storedImageToText(java.lang.String imagefile, java.lang.String langId, java.awt.Rectangle subarea, float zoom) throws SAFSException
Parameters:
imagefile - image file in one of the formats BMP, GIF, JPEG, PNG or TIFF.
langId - if applicable, the language that the OCR engine should recognize. NULL means the OCR engine does not take the language into account.
subarea - the area of the input image to convert. NULL stands for the whole area of the image.
zoom - float, an optional parameter giving the zoom value.
SAFS takes a simple approach: it resizes the image to fit the 300 DPI that OCR requires for better text recognition.
Normal screen-captured images are at low DPI (75~90).
A value between 0 and 1 is the factor for zooming out the source image to fit 300 DPI for text recognition.
A value greater than 1 is the factor for zooming in the source image to fit GOCR's requirement.
The user may supply a zoom value suited to the text in an image so that it can be recognized.
Throws:
SAFSException - if any Exception is encountered
public java.awt.image.BufferedImage zoomImageWithType(java.awt.image.BufferedImage image, int imageType, java.awt.Rectangle subarea, float zoomValue) throws SAFSException
Parameters:
image - a BufferedImage to zoom.
imageType - image type of the zoomed BufferedImage that is returned. Usually BufferedImage.TYPE_BYTE_GRAY, at 8 bits per pixel.
subarea - the area of the input image to convert. NULL stands for the whole area of the image.
zoomValue - a float; a value between 0 and 1 zooms out (shrinks the image), and a value greater than 1 zooms in (enlarges the image), consistent with the zoom parameter of imageToText.
Throws:
SAFSException
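The parameters of zoomImageWithType describe cropping an optional sub-area, scaling it, and returning an image of a chosen type. The following sketch shows one way to do that with the standard java.awt API. The class `ZoomSketch`, the bicubic interpolation hint, and the use of IllegalArgumentException instead of SAFSException are assumptions for illustration and are not taken from the SAFS source.

```java
import java.awt.Graphics2D;
import java.awt.Rectangle;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

/** Hypothetical stand-in for zoomImageWithType, using only java.awt. */
final class ZoomSketch {
    private ZoomSketch() {}

    static BufferedImage zoom(BufferedImage image, int imageType,
                              Rectangle subarea, float zoomValue) {
        if (image == null) {
            throw new IllegalArgumentException("image must not be null");
        }
        if (zoomValue <= 0f) {
            throw new IllegalArgumentException("zoomValue must be positive");
        }
        // A null subarea stands for the whole image; otherwise clip it to the image bounds.
        Rectangle full = new Rectangle(0, 0, image.getWidth(), image.getHeight());
        Rectangle area = (subarea == null) ? full : subarea.intersection(full);
        if (area.isEmpty()) {
            throw new IllegalArgumentException("subarea lies outside the image");
        }
        int w = Math.max(1, Math.round(area.width * zoomValue));
        int h = Math.max(1, Math.round(area.height * zoomValue));

        // The destination image carries the requested type, e.g. TYPE_BYTE_GRAY.
        BufferedImage out = new BufferedImage(w, h, imageType);
        Graphics2D g = out.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                               RenderingHints.VALUE_INTERPOLATION_BICUBIC);
            // Draw the source sub-area scaled onto the full destination rectangle.
            g.drawImage(image, 0, 0, w, h,
                        area.x, area.y, area.x + area.width, area.y + area.height, null);
        } finally {
            g.dispose();
        }
        return out;
    }
}
```

For example, zooming a 20x10 sub-area by 4.0 into `BufferedImage.TYPE_BYTE_GRAY` yields an 80x40 grayscale image in this sketch.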
protected java.lang.String runCommandLine(java.lang.String cmdline) throws SAFSException
Throws:
SAFSException
public static OCREngine getOCREngine(java.lang.String ocrNameKey, STAFHelper staf) throws SAFSException
Parameters:
ocrNameKey - the name of the OCR engine. It can be the constant OCR_T_ENGINE_KEY or OCR_G_ENGINE_KEY. It can also be null, in which case the OCR engine defined by the STAF variable STAF_OCR_ENGINE_VAR_NAME is used.
staf - the STAFHelper used to access STAF variables
Throws:
SAFSException
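The lookup order documented for getOCREngine (the explicit key first, then the STAF variable, then the default key) can be sketched as follows. Plain maps stand in for the engine registry and for the STAF variables; the class `EngineLookupSketch`, its parameters, and the example keys are hypothetical, and the real method returns an OCREngine instance rather than a class name.

```java
import java.util.Map;

/** Hypothetical illustration of the three-step fallback, not the SAFS code. */
final class EngineLookupSketch {
    private EngineLookupSketch() {}

    /**
     * @param ocrNameKey  key requested by the caller, may be null
     * @param registry    assumed mapping of engine key to engine class name
     * @param stafVars    stand-in for STAF variables
     * @param stafVarName name of the variable holding the engine key
     * @param defaultKey  key used when nothing else matches
     */
    static String resolveEngineClass(String ocrNameKey,
                                     Map<String, String> registry,
                                     Map<String, String> stafVars,
                                     String stafVarName,
                                     String defaultKey) {
        // Step 1: the key passed in by the caller.
        String clazz = lookup(registry, ocrNameKey);
        if (clazz != null) {
            return clazz;
        }
        // Step 2: the key stored in the STAF variable.
        clazz = lookup(registry, stafVars.get(stafVarName));
        if (clazz != null) {
            return clazz;
        }
        // Step 3: the default engine key.
        clazz = lookup(registry, defaultKey);
        if (clazz == null) {
            throw new IllegalStateException("No engine registered for default key " + defaultKey);
        }
        return clazz;
    }

    private static String lookup(Map<String, String> registry, String key) {
        return key == null ? null : registry.get(key);
    }
}
```

In this sketch an unrecognized key is treated the same as a missing one, so the lookup simply moves on to the next step.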
public static java.lang.String getOCREngineKey(STAFHelper staf)
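Parameters:
staf - the STAFHelper used to access STAF variables
public static boolean setOCREngineKey(STAFHelper staf, java.lang.String engineKey)
Parameters:
staf - the STAFHelper used to access STAF variables
engineKey - the engine key to store in the STAF variable STAF_OCR_ENGINE_VAR_NAME
public static java.lang.String getOCRLanguageCode(STAFHelper staf)
Parameters:
staf - the STAFHelper used to access STAF variables
public static boolean setOCRLanguageCode(STAFHelper staf, java.lang.String languageCode)
Parameters:
staf - the STAFHelper used to access STAF variables
languageCode - the language code to store in the STAF variable STAF_OCR_LANGUAGE_ID_VAR_NAME
public static void main(java.lang.String[] args)
Parameters:
args - imageFile [-e engine] [-l languageID] [-z scale]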
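The usage string for main is `java OCREngine imageFile [-e engine] [-l languageID] [-z scale]`. A minimal sketch of parsing arguments in that shape is shown below. The class `OcrArgsSketch` and its error handling are hypothetical; the source does not show how main actually processes its arguments.

```java
/** Hypothetical parser for: imageFile [-e engine] [-l languageID] [-z scale] */
final class OcrArgsSketch {
    static final String USAGE =
        "Usage: java OCREngine imageFile [-e engine] [-l languageID] [-z scale]";

    final String imageFile;
    String engine;      // null when -e is absent
    String languageId;  // null when -l is absent
    Float zoom;         // null when -z is absent

    private OcrArgsSketch(String imageFile) {
        this.imageFile = imageFile;
    }

    static OcrArgsSketch parse(String[] args) {
        if (args == null || args.length == 0) {
            throw new IllegalArgumentException(USAGE);
        }
        // The first argument is the mandatory image file.
        OcrArgsSketch parsed = new OcrArgsSketch(args[0]);
        // Every remaining argument is a flag followed by its value.
        for (int i = 1; i < args.length; i++) {
            String flag = args[i];
            if (i + 1 >= args.length) {
                throw new IllegalArgumentException("Missing value for " + flag + ". " + USAGE);
            }
            String value = args[++i];
            switch (flag) {
                case "-e":
                    parsed.engine = value;
                    break;
                case "-l":
                    parsed.languageId = value;
                    break;
                case "-z":
                    // NumberFormatException is a subclass of IllegalArgumentException.
                    parsed.zoom = Float.valueOf(value);
                    break;
                default:
                    throw new IllegalArgumentException("Unknown option " + flag + ". " + USAGE);
            }
        }
        return parsed;
    }
}
```

Options that are omitted stay null in this sketch, which would let a caller fall back to the STAF variables or to defaultZoomScale.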
Copyright © SAS Institute. All Rights Reserved.