Asynchronous REST OCR API
- TypeScript + Express
- Tesseract.js
- Redis
- Bull
sequenceDiagram
actor User
participant Server
participant OCRProcessor
participant RedisQueue
participant RedisDB
User ->> Server: Insert text extraction request (in: File)
activate Server
Server ->> RedisQueue: Put in Redis Queue
Server ->> User: Return "requestID"
deactivate Server
RedisQueue -->> OCRProcessor: Process Message
activate OCRProcessor
OCRProcessor ->> RedisDB: Insert result on Redis DB
deactivate OCRProcessor
User ->> Server: Ask for result (in: requestID)
activate Server
Server ->> RedisDB: Search result
RedisDB ->> Server: Get result
Server ->> User: Return text extraction result
deactivate Server
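The sequence above can be sketched in a few lines of TypeScript. This is a minimal in-memory model, not the project's implementation: the class `OcrFlowSketch`, its method names, and the synchronous `recognize` stub are all assumptions made for illustration, and a plain array and `Map` stand in for the Redis queue and Redis DB.

```typescript
// Minimal in-memory model of the submit / process / poll flow.
// Assumption: an array and a Map stand in for RedisQueue and RedisDB.
type OcrResult = { requestID: string; text: string };
type QueuedJob = { requestID: string; file: string };

class OcrFlowSketch {
  private nextId = 1;
  private queue: QueuedJob[] = [];
  private results = new Map<string, OcrResult>();
  private recognize: (file: string) => string;

  // `recognize` is a hypothetical stub for the OCR step.
  constructor(recognize: (file: string) => string) {
    this.recognize = recognize;
  }

  // Server: enqueue the request and return a requestID right away.
  submit(file: string): string {
    const requestID = String(this.nextId++);
    this.queue.push({ requestID, file });
    return requestID;
  }

  // OCRProcessor: take one message from the queue and store its result.
  // Returns false when the queue is empty.
  processNext(): boolean {
    const job = this.queue.shift();
    if (job === undefined) return false;
    this.results.set(job.requestID, {
      requestID: job.requestID,
      text: this.recognize(job.file),
    });
    return true;
  }

  // Server: look up a result; undefined means it is not available yet.
  getResult(requestID: string): OcrResult | undefined {
    return this.results.get(requestID);
  }
}
```

In this model, `submit` returns before any OCR work happens, so `getResult` yields `undefined` until `processNext` has handled that request.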
POST /api/ocr/recognition/file
Insert request for text extraction
GET /api/ocr/recognition/result
Get result of text extraction
GET /api/ocr/recognition/results
Get multiple results of text extractions in bulk
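Because the result is fetched separately from the request, a client typically polls the result endpoint until the text is ready. The helper below is a sketch of that loop under stated assumptions: `pollForResult`, `fetchResult`, and `sleep` are hypothetical names, and `fetchResult` is assumed to wrap a call to GET /api/ocr/recognition/result and resolve to `null` while the result is pending. How the real API signals a pending result is not specified here.

```typescript
type PollOptions = { maxAttempts: number; intervalMs: number };

// Polls until fetchResult returns text, or gives up after maxAttempts.
// Assumption: fetchResult resolves to null while the result is pending.
async function pollForResult(
  requestID: string,
  fetchResult: (requestID: string) => Promise<string | null>,
  sleep: (ms: number) => Promise<void>,
  options: PollOptions
): Promise<string> {
  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    const text = await fetchResult(requestID);
    if (text !== null) return text;
    // Wait between attempts, but not after the last one.
    if (attempt < options.maxAttempts) await sleep(options.intervalMs);
  }
  throw new Error(
    `Result for ${requestID} not ready after ${options.maxAttempts} attempts`
  );
}
```

Passing `sleep` in as a parameter keeps the helper independent of any timer implementation, which also makes it easy to exercise in a unit test.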
- start-dev: run locally in development mode
- start-prod: run locally in production mode
- test: run tests
- build: build project
- prod: run in production and load env vars from the deployment environment
- docker-build: build Docker image
- docker-run-dev: run the API and Redis locally (without password)
- Pull image:
docker pull redis
- Run with password:
docker run --name redis -d -p 6379:6379 redis redis-server --requirepass 'redispassword'
- Run without password:
docker run --name redis -d -p 6379:6379 redis
- Connect to Redis:
redis-cli
- Login:
AUTH [password]
- Clean all data:
FLUSHDB (current database) and FLUSHALL (all databases)
- Get value type:
TYPE [key]
- List all keys:
KEYS *
- Sorted set count:
ZCARD bull:recognize_eng:failed
- Show Sorted set element:
ZRANGE bull:recognize_eng:failed 0 0
- Read Hash:
HGETALL bull:recognize_eng:419
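The keys used in the commands above follow Bull's naming pattern of prefix, queue name, and then either a job state or a job ID. The helpers below sketch that pattern. The function names are hypothetical, `"bull"` is assumed to be the key prefix (Bull's default), and the state-to-type mapping reflects Bull's usual storage layout rather than anything this project defines.

```typescript
// Job states that Bull keeps under their own Redis keys.
type BullState = "wait" | "active" | "delayed" | "completed" | "failed";

// Key for a state collection, e.g. bull:recognize_eng:failed
function bullStateKey(queueName: string, state: BullState, prefix = "bull"): string {
  return `${prefix}:${queueName}:${state}`;
}

// Key for a single job hash, e.g. bull:recognize_eng:419
function bullJobKey(queueName: string, jobId: number | string, prefix = "bull"): string {
  return `${prefix}:${queueName}:${jobId}`;
}

// Redis type to expect from TYPE [key]:
// wait and active are lists; delayed, completed and failed are sorted sets.
function bullStateRedisType(state: BullState): "list" | "zset" {
  return state === "wait" || state === "active" ? "list" : "zset";
}
```

The Redis type determines which commands apply: ZCARD and ZRANGE for the sorted sets, LLEN and LRANGE for the lists, and HGETALL for an individual job hash.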
- Run unit tests with npm run test.
- Run a specific test case with npm run test -- ${testNamePattern} (e.g. npm run test -- Redis will run only Redis.test.ts).
loadtest -n 20 -c 5 -P '{"url": "https://tesseract.projectnaptha.com/img/eng_bw.png","lang": "eng"}' http://localhost:8080/api/ocr/recognition -T application/json
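In the command above, -n sets the total number of requests, -c the number sent concurrently, -P the POST body, and -T the content type. The sketch below models what "20 requests, 5 at a time" means. It does not show how loadtest works internally; `runLoad` and `sendRequest` are hypothetical names, and `sendRequest` is assumed to perform one HTTP request and reject on failure.

```typescript
type LoadSummary = { completed: number; failed: number; maxInFlight: number };

// Sends `total` requests with at most `concurrency` in flight at once.
// Assumption: sendRequest performs one request and rejects on failure.
async function runLoad(
  total: number,
  concurrency: number,
  sendRequest: (index: number) => Promise<void>
): Promise<LoadSummary> {
  let next = 0;
  let inFlight = 0;
  const summary: LoadSummary = { completed: 0, failed: 0, maxInFlight: 0 };

  // Each worker keeps claiming the next request index until none remain.
  async function worker(): Promise<void> {
    while (next < total) {
      const index = next++;
      inFlight++;
      summary.maxInFlight = Math.max(summary.maxInFlight, inFlight);
      try {
        await sendRequest(index);
        summary.completed++;
      } catch (error) {
        summary.failed++;
      } finally {
        inFlight--;
      }
    }
  }

  const workers: Promise<void>[] = [];
  for (let i = 0; i < Math.min(concurrency, total); i++) {
    workers.push(worker());
  }
  await Promise.all(workers);
  return summary;
}
```

With `total` set to 20 and `concurrency` set to 5, five workers share the 20 requests, so no more than five are ever pending at the same time.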