
BDD Status

Checked 49 scenarios across 41 test files.

Feature: Tokenise source code

  • Tokenise a Python hello-world file
  • Tokenise a JavaScript hello-world file
  • Tokenise a TypeScript hello-world file
  • Tokenise a Rust hello-world file
  • Tokenise a C++ hello-world file
  • Tokenise a Fortran hello-world file
  • Tokenise a Vyper hello-world file

Feature: JSON output format

  • Output is a valid JSON array
  • Token types are non-empty strings
  • Concatenating token values reconstructs the original input
  • Unrecognised characters are reported on stderr not stdout
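The round-trip property above can be sketched as follows. The token field names and type strings here are assumptions for illustration; the tokeniser's actual JSON schema may differ:

```python
import json

# Hypothetical token stream for the input "x = 1" -- the real
# tokeniser's type names may differ.
source = "x = 1"
tokens = [
    {"type": "identifier", "value": "x"},
    {"type": "whitespace", "value": " "},
    {"type": "operator", "value": "="},
    {"type": "whitespace", "value": " "},
    {"type": "number", "value": "1"},
]

# Output is a valid JSON array whose token types are non-empty strings.
output = json.dumps(tokens)
parsed = json.loads(output)
assert isinstance(parsed, list)
assert all(isinstance(t["type"], str) and t["type"] for t in parsed)

# Concatenating token values reconstructs the original input.
assert "".join(t["value"] for t in parsed) == source
```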

Feature: Token type identification

  • Keywords are identified as keyword tokens
  • Identifiers are identified as identifier tokens
  • Whitespace is preserved as whitespace tokens
  • String literals are identified as string literal tokens
  • Operators are identified as operator tokens
  • Keywords are not misidentified as identifiers
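The last scenario guards against a classic lexer bug: treating any token that starts with a keyword as a keyword. One common fix, sketched here with an illustrative keyword set (not the tool's actual lists), is to match a whole word first and only then check it against the keyword set:

```python
import re

KEYWORDS = {"def", "return", "if", "else"}  # illustrative subset
IDENT = re.compile(r"[A-Za-z_]\w*")

def classify(word: str) -> str:
    """Classify a complete word token as keyword or identifier."""
    if not IDENT.fullmatch(word):
        raise ValueError(f"not a word token: {word!r}")
    return "keyword" if word in KEYWORDS else "identifier"

assert classify("def") == "keyword"
# A keyword prefix inside a longer name must remain an identifier.
assert classify("define") == "identifier"
```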

Feature: Comprehensive language tokenisation

  • Tokenise a Python file with function definition and control flow
  • Tokenise a JavaScript file with variable declarations and arrow functions
  • Tokenise a Rust file with struct and impl definitions

Feature: CLI error handling

  • Missing command-line arguments prints usage and exits non-zero
  • Input file not found exits with a clear error message
  • Invalid YAML lexer config exits with a clear error message

Feature: Lexer schema validation

  • Valid lexer YAML passes validation
  • Lexer YAML missing required field fails validation
  • Lexer YAML with a token missing both value and pattern fails validation
  • All bundled lexer files pass validation
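A minimal validator for the rules above might look like this. The schema (a top-level `name` plus a `tokens` list whose entries carry either `value` or `pattern`) is inferred from the scenario names and is an assumption, not the tool's documented format:

```python
def validate_lexer(config: dict) -> list[str]:
    """Return a list of validation errors for a lexer config dict.

    Assumed schema: required top-level fields 'name' and 'tokens';
    each token entry must define 'value' or 'pattern'.
    """
    errors = []
    for field in ("name", "tokens"):
        if field not in config:
            errors.append(f"missing required field: {field}")
    for i, tok in enumerate(config.get("tokens", [])):
        if "value" not in tok and "pattern" not in tok:
            errors.append(f"token {i} has neither 'value' nor 'pattern'")
    return errors

assert validate_lexer({"name": "dsl", "tokens": [{"value": "+"}]}) == []
assert validate_lexer({"tokens": [{}]}) == [
    "missing required field: name",
    "token 0 has neither 'value' nor 'pattern'",
]
```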

Feature: Custom lexer configuration

  • A custom lexer tokenises a simple DSL

Feature: Comment tokenisation

  • Single-line comments are tokenised as comment tokens
  • Multi-line comments are tokenised as comment tokens

Feature: Import statement tokenisation

  • Python import statement is tokenised correctly
  • JavaScript import statement is tokenised correctly
  • Rust use statement is tokenised correctly

Feature: Multi-character operator tokenisation

  • Equality operator is tokenised as a single token
  • Arrow operator is tokenised as a single token
  • Compound assignment operators are tokenised as single tokens
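Matching `==`, `->`, or `+=` as single tokens usually comes down to longest-match-first ordering. A sketch, with an illustrative operator set rather than the tool's actual one:

```python
# Sorting candidates longest-first ensures "==" and "->" win over
# their single-character prefixes "=" and "-".
OPERATORS = sorted(["=", "==", "-", ">", "->", "+", "+="],
                   key=len, reverse=True)

def match_operator(text: str, pos: int):
    """Return the longest operator starting at pos, or None."""
    for op in OPERATORS:
        if text.startswith(op, pos):
            return op
    return None

assert match_operator("a == b", 2) == "=="
assert match_operator("x -> y", 2) == "->"
assert match_operator("n += 1", 2) == "+="
```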

Feature: Number literal tokenisation

  • Integer literals are tokenised as number tokens
  • Float literals are tokenised as number tokens
  • Hexadecimal literals are tokenised as number tokens

Feature: Empty input handling

  • Empty file produces an empty token array
  • Whitespace-only file produces only whitespace tokens

Feature: Duplicate token deduplication

  • Duplicate value-based tokens are rejected during validation
  • Bundled lexers contain no duplicate token values

Feature: Regex pattern precompilation

  • Patterns are compiled once before tokenisation begins
  • Large file tokenisation completes within a reasonable time
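The idea behind these scenarios can be sketched as follows: compile every rule's regex once before the main loop, so match attempts inside the loop never pay compilation cost. The rule table is illustrative, not the tool's bundled lexer:

```python
import re

# Illustrative rules, compiled once up front rather than per match.
RULES = [
    ("number", r"\d+"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("whitespace", r"\s+"),
]
COMPILED = [(name, re.compile(pattern)) for name, pattern in RULES]

def next_token(text: str, pos: int):
    """Return (type, value) for the first rule matching at pos."""
    for name, regex in COMPILED:
        m = regex.match(text, pos)
        if m:
            return name, m.group()
    return None

assert next_token("abc 42", 0) == ("identifier", "abc")
assert next_token("abc 42", 4) == ("number", "42")
```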

Feature: Efficient string slicing

  • String slice extraction uses direct slicing instead of character-by-character concatenation
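The distinction this scenario tests can be shown in miniature: scan to find the end of a run, then take one slice, instead of appending characters to a growing string (which copies the accumulator on every step). A hypothetical helper, not the tool's actual code:

```python
def take_while(text: str, pos: int, predicate) -> str:
    """Extract a run of characters by scanning for the end index
    and slicing once, avoiding per-character concatenation."""
    end = pos
    while end < len(text) and predicate(text[end]):
        end += 1
    return text[pos:end]  # a single slice instead of n string copies

assert take_while("hello world", 0, str.isalpha) == "hello"
assert take_while("123abc", 0, str.isdigit) == "123"
```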

Feature: CLI error messages

  • File not found produces a clean error message without a traceback
  • Invalid YAML produces a clean error message without a traceback
  • Unreadable file produces a clean error message without a traceback

49/49 scenarios covered.