draft pull request for review #21
Draft
zq46 wants to merge 9 commits into
… for loop" This reverts commit ca21311.
Contributor
Author
The peek written above has a bug: it does not return a Stream. I will fix it shortly. The last commit adds the methods zip_by, zip_with, concat_by, and concat_with.
Stream(iterable).collects(lambda _: itertools.chain(iterA, _, iterB))
Stream(iterable).collects(lambda _: zip(iterA, _, iterB))
Owner
That is quite a lot. I have been busy lately, sorry; I will take a closer look later.
Contributor
Author
Only the star_map change affects existing behavior. It has already gone into a separate PR, which you have already merged.
Contributor
Author
I have split these out and submitted them as separate PRs, which should make them easier to review.
feat: add take_while and drop_while
Implements takeWhile and dropWhile from the Java 9 Stream API.
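As a rough illustration of what these two operations do, here is a minimal sketch built on itertools.takewhile and itertools.dropwhile. The Stream class below is a hypothetical stand-in, not the code from this PR, and to_list is an assumed helper name.

```python
from itertools import dropwhile, takewhile


class Stream:
    """Hypothetical minimal stand-in for the library's Stream class."""

    def __init__(self, iterable):
        self._stream = iter(iterable)

    def take_while(self, predicate):
        # Yield items until the predicate first fails, then stop.
        return Stream(takewhile(predicate, self._stream))

    def drop_while(self, predicate):
        # Skip items while the predicate holds, then yield all the rest.
        return Stream(dropwhile(predicate, self._stream))

    def to_list(self):
        # Assumed terminal helper, used here only to show the results.
        return list(self._stream)


taken = Stream([1, 2, 3, 10, 1]).take_while(lambda x: x < 5).to_list()
dropped = Stream([1, 2, 3, 10, 1]).drop_while(lambda x: x < 5).to_list()
print(taken)    # [1, 2, 3]
print(dropped)  # [10, 1]
```

Note that the trailing 1 is kept by drop_while: once the predicate has failed, later items are no longer tested.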
fix: rename starmap to star_map, to keep the naming style consistent
Renames starmap to star_map so that the naming style stays consistent.
feat: add peek, implemented with laziness
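Adds the peek operation back, using yield to make it lazy.
I have rarely used peek myself, but there are several reasons for bringing it back:
(1) it corresponds to Java Stream; (2) it is simple to implement; (3) it is lazily evaluated, so the overhead is small;
(4) as with the more_itertools.side_effect function, peek is useful not only for debugging but also for logging, updating a progress bar, and so on, so someone may find it handy.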
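A lazy peek along these lines might look like the sketch below. The Stream class is a hypothetical stand-in rather than the PR's code, and to_list is an assumed helper. The inner generator runs func only as items are pulled, and the method wraps the generator in a new Stream so that chaining keeps working.

```python
class Stream:
    """Hypothetical minimal stand-in for the library's Stream class."""

    def __init__(self, iterable):
        self._stream = iter(iterable)

    def peek(self, func):
        def _peek(iterable):
            # func runs only when an item is actually requested downstream.
            for item in iterable:
                func(item)
                yield item

        # Return a Stream (not the bare generator) so calls can be chained.
        return Stream(_peek(self._stream))

    def to_list(self):
        # Assumed terminal helper, used here only to drive the stream.
        return list(self._stream)


seen = []
stream = Stream([1, 2, 3]).peek(seen.append)
before = list(seen)        # nothing has been consumed yet
result = stream.to_list()  # consuming the stream triggers the side effect
print(before, result, seen)  # [] [1, 2, 3] [1, 2, 3]
```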
refactor: speed up for_each, by using deque to avoid explicit for loop
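By combining deque and map, the loop that calls func in for_each runs in C, which avoids an explicit Python for loop.
The itertools recipes implement the consume function with a deque of maxlen=0 to exhaust an iterator quickly.
I have seen deque used this way elsewhere too, for example in our implementation of count.
I have tested this, and this implementation is somewhat faster than the explicit for loop.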
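The technique described above can be sketched as a standalone function. This is an illustration under the assumption that for_each simply applies func to every item and discards the results; the free-function form is hypothetical, not the method from this PR.

```python
from collections import deque


def for_each(iterable, func):
    # map() calls func lazily; a deque with maxlen=0 pulls every item
    # through map() and immediately discards it, so nothing is stored.
    deque(map(func, iterable), maxlen=0)


out = []
for_each(range(3), out.append)
print(out)  # [0, 1, 2]
```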
refactor: speed up group_by, by using deque to avoid explicit for loop
Revert "refactor: speed up group_by, by using deque to avoid explicit for loop"
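I had intended to speed up group_by with deque in the same way, but testing showed it was somewhat slower than the explicit for loop, probably because it needs an extra _classify function to be defined.
Switching to the generator expression below is still slower. Perhaps it still involves some extra work compared with the explicit for loop.
deque((groups.setdefault(classifier(i), []).append(i) for i in self._stream), maxlen=0)
So I will remove these two group_by commits later.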
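For comparison, the two variants can be written side by side as free functions. This is a sketch only: the function names are hypothetical, and it checks that both produce the same grouping without making any timing claim of its own.

```python
from collections import deque


def group_by_loop(iterable, classifier):
    # Explicit for loop, the variant reported as faster in the tests above.
    groups = {}
    for item in iterable:
        groups.setdefault(classifier(item), []).append(item)
    return groups


def group_by_deque(iterable, classifier):
    # Generator expression drained by a zero-length deque.
    groups = {}
    deque(
        (groups.setdefault(classifier(i), []).append(i) for i in iterable),
        maxlen=0,
    )
    return groups


by_loop = group_by_loop(range(6), lambda x: x % 2)
by_deque = group_by_deque(range(6), lambda x: x % 2)
print(by_loop)  # {0: [0, 2, 4], 1: [1, 3, 5]}
```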
feat: make distinct lazy, and compatible with seq of unhashable items
Makes the distinct operation lazy and adds support for sequences that contain unhashable elements.
It uses the unique_everseen function from the itertools recipes and more_itertools.
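A generator in the spirit of unique_everseen, with a fallback for unhashable items, might look like the following. It is a sketch and not necessarily the code in this PR: hashable items are tracked in a set, and items that raise TypeError on hashing fall back to a slower list lookup.

```python
from itertools import cycle, islice


def unique_everseen(iterable, key=None):
    seen_set = set()   # fast path for hashable items
    seen_list = []     # fallback for unhashable items (linear lookup)
    for item in iterable:
        k = item if key is None else key(item)
        try:
            if k not in seen_set:
                seen_set.add(k)
                yield item
        except TypeError:
            # k is unhashable (for example a list), so compare by equality.
            if k not in seen_list:
                seen_list.append(k)
                yield item


mixed = list(unique_everseen([1, [2], 1, [2], 3]))
# Laziness: only the first three distinct items of an endless input are pulled.
first_three = list(islice(unique_everseen(cycle([1, 2, 3])), 3))
print(mixed)        # [1, [2], 3]
print(first_three)  # [1, 2, 3]
```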
For example, to look at the first 10 distinct elements, there is no need to deduplicate the whole sequence.
stream(iterable).distinct().limit(10).for_each(print)
feat: add optional parameter for map, to support multiple processing
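This is a stopgap I added so that I could use a parallel map.
Because Stream does not support parallel execution for now, and I have not implemented a way to parallelize the whole Stream,
I am making do with parallelizing just the map step for the time being.
This commit is probably not suitable for merging.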
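The commit's actual parameter is not shown here, so the sketch below is purely an assumption about what such an option could look like: a hypothetical executor keyword that accepts any concurrent.futures.Executor. The Stream class and to_list are stand-ins as well.

```python
from concurrent.futures import ThreadPoolExecutor


class Stream:
    """Hypothetical minimal stand-in for the library's Stream class."""

    def __init__(self, iterable):
        self._stream = iter(iterable)

    def map(self, func, executor=None):
        # 'executor' is an assumed parameter name, not the PR's real one.
        if executor is None:
            return Stream(map(func, self._stream))
        # Executor.map submits all items up front and yields results in
        # input order, so this step is no longer lazy.
        return Stream(executor.map(func, self._stream))

    def to_list(self):
        # Assumed terminal helper, used here only to collect the results.
        return list(self._stream)


with ThreadPoolExecutor(max_workers=2) as pool:
    squares = Stream(range(5)).map(lambda x: x * x, executor=pool).to_list()
print(squares)  # [0, 1, 4, 9, 16]
```

With a ProcessPoolExecutor instead of threads, func would have to be picklable, so a lambda like the one above would not work.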
feat: add zip_by, zip_with, concat_by, concat_with for convenience
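When a stream has been processed up to some intermediate step and needs to be concatenated or zipped with other iterables, either before or after it,
I added these methods so that the chain does not have to be interrupted by starting a separate expression.
I added them before I came up with collects. The same thing can of course now be done with collects plus a lambda, but that looks verbose and ugly.
There is nothing wrong with what these methods do or how they are implemented. The open questions are:
(1) Should a single method with a parameter handle the before/after choice, or should there be two methods, as there are now?
(2) The method names need more thought; the current names are not good.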
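To make question (1) concrete, here is one way the single-method option could look, next to the collects form. Everything in this sketch is hypothetical: the concat name, the before flag, the to_list helper, and the assumption that collects passes the underlying iterator to the given function.

```python
from itertools import chain


class Stream:
    """Hypothetical minimal stand-in for the library's Stream class."""

    def __init__(self, iterable):
        self._stream = iter(iterable)

    def collects(self, func):
        # Assumed behavior: hand the underlying iterator to func.
        return func(self._stream)

    def concat(self, *others, before=False):
        # One method; the flag decides whether others go in front or behind.
        parts = (*others, self._stream) if before else (self._stream, *others)
        return Stream(chain(*parts))

    def to_list(self):
        # Assumed terminal helper, used here only to show the results.
        return list(self._stream)


appended = Stream([2, 3]).concat([4]).to_list()
prepended = Stream([2, 3]).concat([0, 1], before=True).to_list()
via_collects = list(Stream([2, 3]).collects(lambda _: chain([0, 1], _, [4])))
print(appended)      # [2, 3, 4]
print(prepended)     # [0, 1, 2, 3]
print(via_collects)  # [0, 1, 2, 3, 4]
```

The flag keeps the method count down but cannot place iterables on both sides in a single call, which the collects form can.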