ability to use different ocrengine as enum, update of example.py to include ocrengine, clean up of README.md, include .vscode/ in .gitignore, update setup.py to require requests#13
Conversation
.gitignore - update to include .vscode/ files main.py - import Enum and unique from enum module to declare for the new OCREngine_VAL enum class - declare OCREngine_VAL enum class and insert two values, engine_1 which is equal to 1 and engine_2 which is equal to 2 - use the declared enum class in API.__init__, make engine_1 default - check if the passed value for ocrengine is an instance of OCREngine_VAL (to enforce usage of the enum class) and the value is no different from 1 or 2. - use the ocrengine variable inside payload example.py - update to use the OCREngine_VAL enum class when wanting to change the ocrengine of the api README.md - clean up the README.md file with spaces and well defined programming languages for the code blocks - insert a description on how to change the ocrengine with example
main.py - correct the typo when checking if it's an instance of OCREngine_VAL
main.py - give better description when throwing exception example.py - include example using api_with_engine_two
main.py - change the error raised for isinstance() failure from ValueError TypeError - change the error raise for wrong value from Exception() to ValueError()
setup.py - include the dependecy needed for the lib inside setup()
README.md - update README to describe getting apikey from Free OCR API. main.py - refactor processing of files into it's own method named query_api(). - pass timeout parameter to request.post() inside query_api() to specifiy for how to wait before raise a Timeout error instead of hanging the process for undefined amount of time. - call query_api() in all the ocr processors.
main.py - rename OCREngine_VAL to Engine - capitalize the values inside class Engine - rename parameter ocrengine inside __init__() to engine - simplify raised TypeError's message - rename parameter pic_file to image_file in query_api() example.py - update example.py in accordance with main.py
main.py - rename OCREngine_VAL to Engine - capitalize the values inside class Engine - rename parameter ocrengine inside __init__() to engine - simplify raised TypeError's message - rename parameter pic_file to image_file in query_api() example.py - update example.py in accordance with main.py
requirements.txt - remove unnecessarily listed dependencies - remove specific versioning of requests main.py - remove unnecessary except blocks
updated in par with 2nd review
updated in par with 2nd review
requirements.txt - remove unnecessarily listed dependencies - remove specific versioning of requests main.py - remove unnecessary except blocks
requirements.txt - remove unnecessarily listed dependencies - remove specific versioning of requests main.py - remove unnecessary except blocks
main.py - remove key api_key from payload dict - insert new self attribute named api_key - add a headers parameter in requests.post() and place api_key in headers This was done in accordance with the documentation and examples provided by Free OCR API https://ocr.space/ocrapi#postman
|
|
||
|
|
||
| def ocr_file(self, fp): | ||
| def query_api(self, url_data=None, image_file=None): |
There was a problem hiding this comment.
Actually I think you can just leave url_data. I shy away from using data in names, but here I think it's appropriate.
There was a problem hiding this comment.
Actually, hold on—is there anything preventing us from separating this into two parameters, one for base64 image data and one for a URL?
There was a problem hiding this comment.
IMHO, it wouldn't make sense to separate them cause the query they conduct is handled by the same elif statement inside query_api()
elif url_data:
r = requests.post(
self.endpoint,
data=url_data,
timeout=30
)There was a problem hiding this comment.
Sure, but we could very easily recombine them; this would just provide syntactic sugar.
There was a problem hiding this comment.
yea but separating something just to recombine it later on...seems kinda clunky as well as inefficient
There was a problem hiding this comment.
I it's well worth it to have the meaning of the parameter be clear. url_data is pretty ambiguous and could mean a lot of things. One or statement to allow the meaning of both these parameters to be self-documenting is a worthy compromise.
main.py - remove Engine: from docstring for Engine - rename ocreengine to engine in docstring for API - remove import & from ValueError message - update query_api() to if/elif/else statement when querying OCR api README.md - use clear & concise wording in the Use section - rename OCREngine_VAL to Engine - rename OCRENGINE_VAL.val_1 & OCRENGINE_VAL.val_2 to Engine.ENGINE_2 & ENGINE.ENGINE_1
* It might solve the TimeoutError
main.py
- remove key api_key from payload dict
- insert new self attribute named api_key
- add a headers parameter in requests.post() and place api_key in
headers
N.B. This was done in accordance with the documentation and examples provided by Free OCR API https://ocr.space/ocrapi#postman
|
|
||
|
|
||
| def ocr_file(self, fp): | ||
| def query_api(self, url_data=None, image_file=None): |
There was a problem hiding this comment.
Sure, but we could very easily recombine them; this would just provide syntactic sugar.
README.md - use single quotes instead of triple codes for describing the apikey in the Use section - remove description and import of Engine, use instead full package name - update example code to use full package name for Engine
ErikBoesen
left a comment
There was a problem hiding this comment.
Thanks so much for your help! Just a few more things to fix.
|
|
||
|
|
||
| def ocr_file(self, fp): | ||
| def query_api(self, url_data=None, image_file=None): |
There was a problem hiding this comment.
I it's well worth it to have the meaning of the parameter be clear. url_data is pretty ambiguous and could mean a lot of things. One or statement to allow the meaning of both these parameters to be self-documenting is a worthy compromise.
| :param fp: A path or pointer to your file | ||
| :return: Result in JSON format | ||
| """ | ||
| with (open(fp, 'rb') if type(fp) == str else fp) as f: |
There was a problem hiding this comment.
This is kind of dense, can we reorganize to be a bit clearer?
There was a problem hiding this comment.
can you clarify on this. Do you want to reorganize the code or the documentation? and if so how do you want to reorganize the code or documentation? Dense doesn't really communicate what you want to do
There was a problem hiding this comment.
The code. As in, split the ternary if/else into a full if/else statement across multiple lines so that we don't have a ton of logic in a single line making it difficult to read and understand at a glance
There was a problem hiding this comment.
can you clarify on this. Do you want to reorganize the code or the documentation? and if so how do you want to reorganize the code or documentation? Dense doesn't really communicate what you want to do
The code could be made more readable (or more verbose).
LICENSE - remove h1 tag from header README.md - remove duplicates of Engine - remove spaces from the authors' names & urls setup.py - bump version to 2.4.0 - restore author's email main.py - remove unique and paranthesis from enum import - replace Enum import by IntEnum - replace double quotes with single quotes - rename url_data to image_url with a better description in the docs
ErikBoesen
left a comment
There was a problem hiding this comment.
oops, wrote these comments up a long time ago and then never submitted the review. Looks like I was only looking for some small changes so should be straightforward
| @@ -1,7 +1,8 @@ | |||
| MIT License | |||
| MIT License | |||
There was a problem hiding this comment.
No need for leading space either
| """ | ||
| if not isinstance(engine, Engine): | ||
| raise TypeError('engine must be an instance of Engine') | ||
| if engine.value != 1 and engine.value != 2: |
There was a problem hiding this comment.
I don't think this check is needed; it's impossible to make an IntEnum class with an invalid value as far as I know
| :param fp: A path or pointer to your file | ||
| :return: Result in JSON format | ||
| """ | ||
| with (open(fp, 'rb') if type(fp) == str else fp) as f: |
There was a problem hiding this comment.
The code. As in, split the ternary if/else into a full if/else statement across multiple lines so that we don't have a ton of logic in a single line making it difficult to read and understand at a glance
| Process image from a local path. | ||
| :param fp: A path or pointer to your file | ||
| Process the provided parameter. | ||
| :param image_url: An Image url or base64image encoded string |
There was a problem hiding this comment.
You may have forgotten to use this parameter in the function
ability to use different ocrengine as enum, update of example.py to include ocrengine, clean up of README.md, include .vscode/ in .gitignore, update setup.py to require requests