
This module uses ANTLR-generated C parser classes to list the functions defined in a piece of C source code. It adds the directory containing the generated parser to the import path, defines a FunctionExtractor listener that records each function definition it encounters, and provides get_functions, which parses a code string and returns what the listener collected.

Run example

npm run import -- "list c functions with python"

list c functions with python

import os
from antlr4 import *
ANTLR_DIRECTORY = os.path.join(os.path.dirname(__file__), "../Resources/Parsers/c")
import sys
sys.path.append(ANTLR_DIRECTORY)
from CLexer import CLexer
from CParser import CParser
from CListener import CListener  # ANTLR-generated listener class

class FunctionExtractor(CListener):
    def __init__(self):
        self.functions = []

    def enterFunctionDefinition(self, ctx):
        function_name = ctx.declarator().directDeclarator().getText()
        self.functions.append(function_name)

def get_functions(code_string):
    lexer = CLexer(InputStream(code_string))
    stream = CommonTokenStream(lexer)
    parser = CParser(stream)
    tree = parser.compilationUnit()
    
    listener = FunctionExtractor()
    walker = ParseTreeWalker()
    walker.walk(listener, tree)
    
    return listener.functions

__all__ = {
  "get_functions": get_functions
}

What the code could have been:

import os
import sys
from antlr4 import *

ANTLR_DIRECTORY = os.path.join(os.path.dirname(__file__), "../Resources/Parsers/c")
sys.path.append(ANTLR_DIRECTORY)

from CLexer import CLexer
from CParser import CParser
from CListener import CListener

class FunctionExtractor(CListener):
    """
    A listener class for extracting function definitions from C code.

    Attributes:
        functions (list): A list of extracted function names
    """

    def __init__(self):
        """
        Initializes the FunctionExtractor instance.
        """
        self.functions = []

    def enterFunctionDefinition(self, ctx):
        """
        Called when entering a function definition.

        Args:
            ctx (ParserRuleContext): The current parsing context
        """
        function_name = ctx.declarator().directDeclarator().getText()
        self.functions.append(function_name)

def get_functions(code_string):
    """
    Extracts function definitions from a given C code string.

    Args:
        code_string (str): The C code string to parse

    Returns:
        list: A list of extracted function names
    """
    if not code_string.strip():  # Check for empty string
        return []

    try:
        lexer = CLexer(InputStream(code_string))
        stream = CommonTokenStream(lexer)
        parser = CParser(stream)
        tree = parser.compilationUnit()
        
        listener = FunctionExtractor()
        walker = ParseTreeWalker()
        walker.walk(listener, tree)
        
        return listener.functions
    
    except Exception as e:
        # Report the failure and fall back to an empty result
        print(f"Error parsing code: {e}")
        return []

__all__ = {
  "get_functions": get_functions
}

Code Breakdown

Importing Libraries

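The code begins by importing the standard os module and everything the antlr4 runtime exports (InputStream, CommonTokenStream, ParseTreeWalker, and so on):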

import os
from antlr4 import *

Setting Up ANTLR Directory

It then locates the directory holding the parser modules that ANTLR (ANother Tool for Language Recognition) generated for C:

ANTLR_DIRECTORY = os.path.join(os.path.dirname(__file__), "../Resources/Parsers/c")
import sys
sys.path.append(ANTLR_DIRECTORY)

This appends that directory to sys.path, Python's module search path, so the generated modules can be imported by name. The path is resolved relative to this file, not the current working directory.
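The effect of appending a directory to sys.path can be seen in isolation. The following is a minimal sketch using only the standard library: it writes a stand-in module into a temporary directory (the module name demo_generated_lexer and its contents are hypothetical, not part of the real generated parser) and imports it by name once the directory is on the search path.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as parser_dir:
    # Write a tiny stand-in for a generated module (hypothetical name).
    module_path = os.path.join(parser_dir, "demo_generated_lexer.py")
    with open(module_path, "w") as handle:
        handle.write("TOKEN_NAMES = ['Identifier', 'Constant']\n")

    # Without this line the import below would raise ModuleNotFoundError.
    sys.path.append(parser_dir)
    try:
        importlib.invalidate_caches()
        demo = importlib.import_module("demo_generated_lexer")
        token_names = demo.TOKEN_NAMES
    finally:
        # Undo the path change so the sketch leaves no trace behind.
        sys.path.remove(parser_dir)

print(token_names)
```

Appending, as opposed to inserting at the front, means modules with the same name found earlier on the path still take precedence.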

Importing ANTLR-Generated Classes

The code then imports ANTLR-generated classes for a C language parser:

from CLexer import CLexer
from CParser import CParser
from CListener import CListener  # ANTLR-generated listener class

Defining a Function Extractor Class

A custom class FunctionExtractor subclasses CListener and overrides only the callback it needs:

class FunctionExtractor(CListener):
    def __init__(self):
        self.functions = []

    def enterFunctionDefinition(self, ctx):
        function_name = ctx.declarator().directDeclarator().getText()
        self.functions.append(function_name)

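The tree walker calls enterFunctionDefinition each time it enters a function definition node, and the listener appends the text of that node's direct declarator to self.functions. Note that getText() returns the concatenated text of every token under the node, so if the grammar nests the parameter list inside directDeclarator (as the widely used grammars-v4 C grammar does), each entry will look like main(void) instead of a bare name.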
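The listener mechanism can be illustrated without the ANTLR runtime. The sketch below is a toy model, not the real API: Node, walk, leaf, and NameCollector are hypothetical stand-ins for ANTLR's context objects, ParseTreeWalker, and the listener, and the hand-built tree assumes the direct declarator nests the way it does in the grammars-v4 C grammar. It shows how enter callbacks are dispatched by rule name, and how descending to the innermost direct declarator yields the bare function name.

```python
class Node:
    """Hypothetical stand-in for an ANTLR parse-tree node."""

    def __init__(self, rule, text="", children=()):
        self.rule = rule
        self.text = text
        self.children = list(children)

    def getText(self):
        # Mirrors ANTLR's getText(): concatenate the text of every leaf below.
        if not self.children:
            return self.text
        return "".join(child.getText() for child in self.children)


def walk(listener, node):
    """Depth-first walk calling enter<Rule>/exit<Rule> when the listener defines them."""
    enter = getattr(listener, "enter" + node.rule, None)
    if enter is not None:
        enter(node)
    for child in node.children:
        walk(listener, child)
    leave = getattr(listener, "exit" + node.rule, None)
    if leave is not None:
        leave(node)


class NameCollector:
    def __init__(self):
        self.declarator_texts = []
        self.names = []

    def enterFunctionDefinition(self, ctx):
        declarator = next(c for c in ctx.children if c.rule == "Declarator")
        direct = declarator.children[0]
        self.declarator_texts.append(direct.getText())
        # Descend through nested direct declarators to reach the identifier.
        while direct.children and direct.children[0].rule == "DirectDeclarator":
            direct = direct.children[0]
        self.names.append(direct.getText())


def leaf(text, rule="Token"):
    return Node(rule, text)


# Hand-built tree for the C source: int main(void) {}
tree = Node("FunctionDefinition", children=[
    leaf("int"),
    Node("Declarator", children=[
        Node("DirectDeclarator", children=[
            Node("DirectDeclarator", children=[leaf("main", "Identifier")]),
            leaf("("), leaf("void"), leaf(")"),
        ]),
    ]),
    leaf("{"), leaf("}"),
])

collector = NameCollector()
walk(collector, tree)
print(collector.declarator_texts)  # ['main(void)']
print(collector.names)             # ['main']
```

With the real generated classes, the same idea would mean following directDeclarator() calls inward until no nested direct declarator remains. Whether that is necessary depends on the grammar the parser was generated from.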

Defining the get_functions Function

A function get_functions is defined, which takes a code string as input, parses it using ANTLR, and extracts function names using the FunctionExtractor class:

def get_functions(code_string):
    lexer = CLexer(InputStream(code_string))
    stream = CommonTokenStream(lexer)
    parser = CParser(stream)
    tree = parser.compilationUnit()
    
    listener = FunctionExtractor()
    walker = ParseTreeWalker()
    walker.walk(listener, tree)
    
    return listener.functions

Exporting the Function

Finally, the module assigns to __all__ a dictionary that maps the name get_functions to the function itself:

__all__ = {
  "get_functions": get_functions
}

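Because get_functions is a top-level name, other modules can import it whether or not __all__ is set. In standard Python, __all__ is a sequence of name strings that controls what a star import (from module import *) exposes. The dictionary form used here is unconventional and presumably serves this project's own import tooling.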
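For comparison, the conventional form of __all__ is a list of strings. The sketch below builds a throwaway in-memory module (the name c_tools_demo and its placeholder functions are hypothetical) and checks which names a star import brings in.

```python
import sys
import types

# Build a throwaway module in memory (hypothetical name and contents).
demo = types.ModuleType("c_tools_demo")
exec(
    "def get_functions(code_string):\n"
    "    return []\n"
    "\n"
    "def other():\n"
    "    return None\n"
    "\n"
    "__all__ = ['get_functions']\n",
    demo.__dict__,
)
sys.modules["c_tools_demo"] = demo

# A star import copies only the names listed in __all__.
namespace = {}
exec("from c_tools_demo import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)  # ['get_functions']

# Names left out of __all__ can still be imported explicitly.
from c_tools_demo import other
```

Listing a name in __all__ therefore only narrows star imports. It does not make a function importable, and leaving a name out does not hide it.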