Releases: apify/crawlee-python
1.2.1
1.2.1 (2025-12-16)
🐛 Bug Fixes
1.2.0
1.1.1
1.1.1 (2025-12-02)
🐛 Bug Fixes
- Unify separators in `unique_key` construction (#1569) (af46a37) by @vdusek
- Fix `same-domain` strategy ignoring public suffix (#1572) (3d018b2) by @Pijukatel
- Make context helpers work in `FailedRequestHandler` and `ErrorHandler` (#1570) (b830019) by @Pijukatel
- Fix non-ASCII character corruption in `FileSystemStorageClient` on systems without UTF-8 default encoding (#1580) (f179f86) by @Mantisus
- Respect `<base>` when enqueuing (#1590) (de517a1) by @Mantisus
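
For readers unfamiliar with why `<base>` matters when enqueuing links, here is a minimal standard-library sketch of the general resolution rule. It is not Crawlee's implementation; the helper name `resolve_link` and its parameters are hypothetical and exist only for illustration.

```python
from typing import Optional
from urllib.parse import urljoin


def resolve_link(page_url: str, href: str, base_href: Optional[str] = None) -> str:
    """Resolve a link against the document's <base href> if present, else the page URL.

    Hypothetical helper for illustration only; not part of Crawlee's API.
    """
    # A <base href> may itself be relative, so resolve it against the page URL first.
    effective_base = urljoin(page_url, base_href) if base_href else page_url
    return urljoin(effective_base, href)


page = "https://example.com/docs/page.html"
# Without <base>, the relative link resolves against the page URL.
without_base = resolve_link(page, "item.html")
# With <base href="/assets/">, the same link resolves against the base instead.
with_base = resolve_link(page, "item.html", base_href="/assets/")
```

A crawler that ignores `<base>` would enqueue the first URL even on pages that declare a base, which is presumably the class of bug this entry addresses.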
1.1.0
1.1.0 (2025-11-18)
🚀 Features
- Add `chrome` `BrowserType` for `PlaywrightCrawler` to use the Chrome browser (#1487) (b06937b) by @Mantisus
- Add `RedisStorageClient` based on Redis v8.0+ (#1406) (d08d13d) by @Mantisus
- Add support for Python 3.14 (#1553) (89e9130) by @Mantisus
- Add `transform_request_function` parameter for `SitemapRequestLoader` (#1525) (dc90127) by @Mantisus
🐛 Bug Fixes
- Improve indexing of the `request_queue_records` table for `SqlRequestQueueClient` (#1527) (6509534) by @Mantisus
- Improve error handling for `RobotsTxtFile.load` (#1524) (596a311) by @Mantisus
- Fix `crawler_runtime` not being updated during run and only in the end (#1540) (0d6c3f6) by @Pijukatel
- Ensure persist state event emission when exiting `EventManager` context (#1562) (6a44f17) by @Pijukatel
1.0.4
1.0.4 (2025-10-24)
🐛 Bug Fixes
- Respect `enqueue_strategy` in `enqueue_links` (#1505) (6ee04bc) by @Mantisus
- Exclude incorrect links before checking `robots.txt` (#1502) (3273da5) by @Mantisus
- Resolve compatibility issue between `SqlStorageClient` and `AdaptivePlaywrightCrawler` (#1496) (ce172c4) by @Mantisus
- Fix `BasicCrawler` statistics persistence (#1490) (1eb1c19) by @Pijukatel
- Save context state in result for `AdaptivePlaywrightCrawler` after isolated processing in `SubCrawler` (#1488) (62b7c70) by @Mantisus
1.0.3
1.0.3 (2025-10-17)
🐛 Bug Fixes
- Add support for Pydantic v2.12 (#1471) (35c1108) by @Mantisus
- Fix database version warning message (#1485) (18a545e) by @Mantisus
- Fix `reclaim_request` in `SqlRequestQueueClient` to correctly update the request state (#1486) (1502469) by @Mantisus
- Fix `KeyValueStore.auto_saved_value` failing in some scenarios (#1438) (b35dee7) by @Pijukatel
1.0.2
1.0.1
1.0.1 (2025-10-06)
🐛 Bug Fixes
- Fix memory leak in `PlaywrightCrawler` on browser context creation (#1446) (bb181e5) by @Pijukatel
- Update templates to handle optional httpx client (#1440) (c087efd) by @Pijukatel
1.0.0
1.0.0 (2025-09-29)
- Check out the Release blog post for more details.
- Check out the Upgrading guide to ensure a smooth update.
🚀 Features
- Add utility for load and parse Sitemap and `SitemapRequestLoader` (#1169) (66599f8) by @Mantisus
- Add periodic status logging and `status_message_callback` parameter for customization (#1265) (b992fb2) by @Mantisus
- Add crawlee-cli option to skip project installation (#1294) (4d5aef0) by @Pijukatel
- Improve `Crawlee` CLI help text (#1297) (afbe10f) by @Pijukatel
- Add basic `OpenTelemetry` instrumentation (#1255) (a92d8b3) by @Pijukatel
- Add `ImpitHttpClient` http-client client using the `impit` library (#1151) (0d0d268) by @Mantisus
- Prevent overloading system memory when running locally (#1270) (30de3bd) by @janbuchar
- Expose `PlaywrightPersistentBrowser` class (#1314) (b5fa955) by @Mantisus
- Add `impit` option for Crawlee CLI (#1312) (508d7ce) by @Mantisus
- Persist RequestList state (#1274) (cc68014) by @janbuchar
- Persist `DefaultRenderingTypePredictor` state (#1340) (fad4c25) by @Mantisus
- Persist the `SitemapRequestLoader` state (#1347) (27ef9ad) by @Mantisus
- Add support for NDU storages (#1401) (5dbd212) by @vdusek
- Add RQ id, name, alias args to `add_requests` and `enqueue_links` methods (#1413) (1cae2bc) by @Mantisus
- Add `SqlStorageClient` based on `sqlalchemy` v2+ (#1339) (07c75a0) by @Mantisus
🐛 Bug Fixes
- Fix memory estimation not working on MacOS (#1330) (ab020eb) by @Pijukatel
- Fix retry count to not count the original request (#1328) (74fa1d9) by @Pijukatel
- [breaking] Remove unused "stats" field from RequestQueueMetadata (#1331) (0a63bef) by @vdusek
- Ignore unknown parameters passed in cookies (#1336) (50d3ef7) by @Mantisus
- Fix `timeout` for `stream` method in `ImpitHttpClient` (#1352) (54b693b) by @Mantisus
- Include reason in the session rotation warning logs (#1363) (d6d7a45) by @vdusek
- Improve crawler statistics logging (#1364) (1eb6da5) by @vdusek
- Do not add a request that is already in progress to `MemoryRequestQueueClient` (#1384) (3af326c) by @Mantisus
- Save `RequestQueueState` for `FileSystemRequestQueueClient` in default KVS (#1411) (6ee60a0) by @Mantisus
- Set default desired concurrency for non-browser crawlers to 10 (#1419) (1cc9401) by @vdusek
Refactor
- [breaking] Introduce new storage client system (#1194) (de1c03f) by @vdusek
- [breaking] Split `BrowserType` literal into two different literals based on context (#1070) (72b5698) by @Pijukatel
- [breaking] Change method `HttpResponse.read` from sync to async (#1296) (83fa8a4) by @Mantisus
- [breaking] Replace `HttpxHttpClient` with `ImpitHttpClient` as default HTTP client (#1307) (c803a97) by @Mantisus
- [breaking] Change Dataset unwind parameter to accept list of strings (#1357) (862a203) by @vdusek
- [breaking] Remove `Request.id` field (#1366) (32f3580) by @Pijukatel
- [breaking] Refactor storage creation and caching, configuration and services (#1386) (04649bd) by @Pijukatel
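
Of the breaking changes above, the sync-to-async switch of `HttpResponse.read` is the one most likely to require mechanical edits in calling code. The sketch below shows the shape of that migration under stated assumptions: `FakeResponse` and `body_length` are hypothetical stand-ins written for this example, not Crawlee classes or functions, and the Upgrading guide remains the authority on the real API.

```python
import asyncio


class FakeResponse:
    """Hypothetical stand-in for a response object; not Crawlee's HttpResponse."""

    def __init__(self, body: bytes) -> None:
        self._body = body

    async def read(self) -> bytes:
        # After the change, reading the body is a coroutine rather than a plain method.
        return self._body


async def body_length(response: FakeResponse) -> int:
    # Before the change a caller would write: body = response.read()
    # After the change the call must be awaited:
    body = await response.read()
    return len(body)


length = asyncio.run(body_length(FakeResponse(b"hello")))
```

Forgetting the `await` leaves `body` bound to a coroutine object instead of `bytes`, which typically surfaces as a `TypeError` at the first use of the value.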
0.6.12
0.6.12 (2025-07-30)
🚀 Features
🐛 Bug Fixes
- Use `perf_counter_ns` for request duration tracking (#1260) (9e92f6b) by @Pijukatel, closes #1256
- Fix memory estimation not working on MacOS (#1330) (8558954) by @Pijukatel, closes #1329
- Fix retry count to not count the original request (#1328) (1aff3aa) by @Pijukatel, closes #1326
- Ignore unknown parameters passed in cookies (#1336) (0f2610c) by @Mantisus, closes #1333
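
As background for the `perf_counter_ns` entry above: `time.perf_counter_ns` is a monotonic, integer-nanosecond clock in the Python standard library, which makes it suited to measuring short durations. The following is a minimal sketch of duration tracking with it; the helper `timed_call` is hypothetical and does not reflect how Crawlee records request durations internally.

```python
import time
from datetime import timedelta


def timed_call(func, *args, **kwargs):
    """Run func and return (result, elapsed) measured with a monotonic ns counter.

    Hypothetical helper for illustration only.
    """
    start_ns = time.perf_counter_ns()
    result = func(*args, **kwargs)
    elapsed_ns = time.perf_counter_ns() - start_ns
    # timedelta has microsecond resolution, so convert from nanoseconds.
    return result, timedelta(microseconds=elapsed_ns / 1_000)


total, elapsed = timed_call(sum, [1, 2, 3])
```

Unlike wall-clock time, this counter never goes backwards, so the measured duration cannot be negative even if the system clock is adjusted mid-request.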