Skip to content

🧹 Centralize duplicated helper functions in alpha191/utils.py#19

Open
Uwater1 wants to merge 1 commit into
mainfrom
refactor-duplicate-helpers-7650666659604456366
Open

🧹 Centralize duplicated helper functions in alpha191/utils.py#19
Uwater1 wants to merge 1 commit into
mainfrom
refactor-duplicate-helpers-7650666659604456366

Conversation

@Uwater1

@Uwater1 Uwater1 commented Mar 7, 2026

Copy link
Copy Markdown
Owner

Consolidated duplicated get_alpha_func and get_stock_codes helper functions from several scripts into alpha191/utils.py. The centralized functions are more robust and flexible, accommodating the different needs of various scripts (e.g., DataFrame vs code/benchmark input, varying error handling). This improvement enhances the maintainability and readability of the codebase.


PR created automatically by Jules for task 7650666659604456366 started by @Uwater1

This commit centralizes the `get_alpha_func` and `get_stock_codes` helper
functions that were duplicated across multiple test and assessment scripts.
The new implementation in `alpha191/utils.py` is more robust, supporting
different naming conventions (alphaXXX vs alpha_XXX) and providing unified
error handling.

Updated scripts:
- speedtest.py
- fulltest.py
- calculate_covariance.py
- select_alphas.py
- run_alpha_tests.py
- ICtest.py
- grouptest.py

Added unit tests for the new utility functions in tests/test_alphas.py.

Co-authored-by: Uwater1 <194621470+Uwater1@users.noreply.github.com>
@google-labs-jules

Copy link
Copy Markdown
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

@gemini-code-assist

Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the codebase by centralizing commonly used helper functions into a dedicated utility module. This change streamlines the process of retrieving alpha functions and stock codes, eliminating duplicated logic across various analysis scripts. The primary impact is improved code consistency, easier maintenance, and enhanced flexibility for future development, as all scripts now rely on a single, robust implementation of these core utilities.

Highlights

  • Centralized Helper Functions: The get_alpha_func and get_stock_codes helper functions, previously duplicated across several scripts, have been consolidated into alpha191/utils.py. This centralization improves code maintainability and reduces redundancy.
  • Enhanced get_alpha_func Utility: The newly centralized get_alpha_func is more robust, supporting both alphaXXX and alpha_XXX naming conventions, and includes options for DataFrame-based functions (use_df) and error handling (ignore_errors).
  • Codebase Refactoring: Multiple scripts (ICtest.py, calculate_covariance.py, fulltest.py, grouptest.py, run_alpha_tests.py, select_alphas.py, speedtest.py) have been updated to import and utilize these new centralized utility functions, removing their own local implementations and associated imports.
  • Improved Test Coverage: New unit tests have been added in tests/test_alphas.py to ensure the correct functionality and robustness of the get_alpha_func and get_stock_codes utilities.
Changelog
  • ICtest.py
    • Removed local imports for sys, importlib, and Path.
    • Imported get_alpha_func from alpha191.utils.
    • Replaced the local alpha function import logic with a call to the new get_alpha_func.
  • alpha191/utils.py
    • Added get_alpha_func to dynamically import alpha functions with flexible options for naming conventions and error handling.
    • Added get_stock_codes to retrieve stock codes from a benchmark directory.
    • Updated type hints to include Union and Any for new function signatures.
  • calculate_covariance.py
    • Removed local imports for os, sys, and Path.
    • Removed the local get_alpha_func definition.
    • Imported get_alpha_func and get_stock_codes from alpha191.utils.
    • Updated the usage of get_alpha_func to include use_df and ignore_errors parameters.
    • Replaced local directory listing for stock codes with a call to get_stock_codes.
  • fulltest.py
    • Removed local imports for Path.
    • Removed the local get_alpha_func and get_stock_codes definitions.
    • Imported get_alpha_func and get_stock_codes from alpha191.utils.
  • grouptest.py
    • Removed local imports for sys, importlib, and Path.
    • Imported get_alpha_func from alpha191.utils.
    • Replaced the local alpha function import logic with a call to the new get_alpha_func.
  • run_alpha_tests.py
    • Removed local import for importlib.
    • Imported get_alpha_func from alpha191.utils.
    • Replaced the local alpha function import logic with a call to the new get_alpha_func.
  • select_alphas.py
    • Removed local import for importlib.
    • Removed the local get_alpha_function definition.
    • Imported get_alpha_func from alpha191.utils.
    • Updated the usage of get_alpha_function to call the new get_alpha_func with use_df and ignore_errors parameters.
  • speedtest.py
    • Removed local imports for Path.
    • Removed the local get_alpha_func and get_stock_codes definitions.
    • Imported get_alpha_func and get_stock_codes from alpha191.utils.
  • tests/test_alphas.py
    • Added test_get_alpha_func to verify the functionality of the centralized alpha function retrieval.
    • Added test_get_stock_codes to verify the functionality of the centralized stock code retrieval.
Activity
  • The pull request was created automatically by Jules for task 7650666659604456366, started by @Uwater1.
  • No human comments or reviews have been recorded on this pull request yet.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request is a great refactoring effort that centralizes duplicated helper functions, get_alpha_func and get_stock_codes, into alpha191/utils.py. This significantly improves code maintainability and readability by removing redundant code from multiple scripts. The new centralized functions are also more robust. I've added a couple of suggestions to further improve the new utility function and its tests for better error handling, clarity, and test reliability.

Note: Security Review did not run due to the size of the PR.

Comment thread alpha191/utils.py
Comment on lines +194 to +234
def get_alpha_func(alpha_id: Union[int, str], use_df: bool = False, ignore_errors: bool = False) -> Optional[Any]:
"""
Get the alpha function by number or name.

Args:
alpha_id: Alpha number (e.g., 17) or name (e.g., "alpha017")
use_df: If True, return the function that takes a DataFrame (alpha_XXX).
If False, return the function that takes code/benchmark (alphaXXX).
ignore_errors: If True, return None on failure instead of raising ValueError.

Returns:
The alpha function or None if not found and ignore_errors is True.
"""
import importlib
try:
alpha_name = format_alpha_name(str(alpha_id))
module = importlib.import_module(f"alpha191.{alpha_name}")

if use_df:
# Try alpha_XXX first (preferred for DataFrame input)
func_name = f"alpha_{int(alpha_name[5:]):03d}"
if hasattr(module, func_name):
return getattr(module, func_name)
# Fallback to alphaXXX
if hasattr(module, alpha_name):
return getattr(module, alpha_name)
else:
# Try alphaXXX first (preferred for code/benchmark input)
if hasattr(module, alpha_name):
return getattr(module, alpha_name)
# Fallback to alpha_XXX
func_name = f"alpha_{int(alpha_name[5:]):03d}"
if hasattr(module, func_name):
return getattr(module, func_name)

except (ImportError, ModuleNotFoundError, ValueError):
pass

if ignore_errors:
return None
raise ValueError(f"Alpha function for '{alpha_id}' not found")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This function can be improved for clarity, robustness, and to follow best practices.

  1. Local Import: import importlib is inside the function. According to PEP 8, imports should be at the top of the file unless there's a specific reason like avoiding circular dependencies. Please move import importlib to the top of alpha191/utils.py.
  2. DRY Principle: The logic for finding the function when use_df is True or False is very similar and can be refactored to avoid repetition. You can define the search order based on the use_df flag and then iterate through it.
  3. Error Handling: The except (ImportError, ModuleNotFoundError, ValueError): pass block is too broad. It catches ValueError from format_alpha_name but then raises a generic "not found" error, hiding the original, more specific error. It's better to handle exceptions more granularly to provide clearer error messages.

Here is a suggested refactoring that addresses points 2 and 3 (please also address point 1 separately by moving the import):

def get_alpha_func(alpha_id: Union[int, str], use_df: bool = False, ignore_errors: bool = False) -> Optional[Any]:
    """
    Get the alpha function by number or name.

    Args:
        alpha_id: Alpha number (e.g., 17) or name (e.g., "alpha017")
        use_df: If True, return the function that takes a DataFrame (alpha_XXX).
               If False, return the function that takes code/benchmark (alphaXXX).
        ignore_errors: If True, return None on failure instead of raising ValueError.

    Returns:
        The alpha function or None if not found and ignore_errors is True.
    """
    try:
        alpha_name = format_alpha_name(str(alpha_id))
    except ValueError as e:
        if ignore_errors:
            return None
        raise ValueError(f"Invalid alpha name format for '{alpha_id}'") from e

    try:
        module = importlib.import_module(f"alpha191.{alpha_name}")
    except (ImportError, ModuleNotFoundError) as e:
        if ignore_errors:
            return None
        raise ValueError(f"Could not import module for alpha '{alpha_id}'") from e

    df_func_name = f"alpha_{int(alpha_name[5:]):03d}"
    code_func_name = alpha_name

    # Determine search order based on use_df flag
    search_order = [df_func_name, code_func_name] if use_df else [code_func_name, df_func_name]

    for func_name in search_order:
        if hasattr(module, func_name):
            return getattr(module, func_name)

    if ignore_errors:
        return None
    raise ValueError(f"Alpha function for '{alpha_id}' not found in module '{alpha_name}'")

Comment thread tests/test_alphas.py
Comment on lines +1020 to +1033
def test_get_stock_codes(self):
"""Test get_stock_codes lists files correctly."""
from alpha191.utils import get_stock_codes
from pathlib import Path

# This test depends on the existence of bao/hs300 directory
if Path("bao/hs300").exists():
codes = get_stock_codes("hs300")
self.assertIsInstance(codes, list)
if len(codes) > 0:
self.assertTrue(isinstance(codes[0], str))
else:
with self.assertRaises(FileNotFoundError):
get_stock_codes("nonexistent_benchmark")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This test depends on the actual file system state (the existence of the bao/hs300 directory), which makes it brittle and not fully reliable in different environments. A better practice for unit tests is to isolate them from external dependencies like the file system by using mocking.

You can use unittest.mock.patch to mock pathlib.Path and its methods (exists, glob) to simulate the presence or absence of directories and files. This will make your test deterministic and more robust, ensuring it covers both success and failure cases regardless of the environment.

For example, you could create separate test methods for the found and not-found cases using mocking:

from unittest.mock import patch, MagicMock

# ... inside your test class ...

@patch('alpha191.utils.PROJECT_ROOT')
def test_get_stock_codes_found(self, mock_project_root):
    # ... setup mock for existing directory with files ...
    # ... assert correct codes are returned ...

@patch('alpha191.utils.PROJECT_ROOT')
def test_get_stock_codes_not_found(self, mock_project_root):
    # ... setup mock for non-existent directory ...
    # ... assert FileNotFoundError is raised ...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant