www.xbdev.net
xbdev - software development
Friday May 9, 2025

PHP...

a powerful, flexible, fully supported, battle tested server side language ..

 

C to HTML Converter (PHP Version)

by bkenwright@xbdev.net



Converting code into 'colored' HTML isn't as simple as you'd think - you have to identify which parts of the code are comments, keywords and other delimiters.

Different languages have different criteria/syntax - so one size does not fit all.

However, a popular language style is the 'C-style' language, which is used in 'JavaScript', 'C#', 'C++', 'GLSL' etc. The keywords and syntax aren't identical - but they're very similar (e.g., single and multiline comments, function/variable declarations...)
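
Since these languages differ mainly in their keyword lists, one way to support several of them is a lookup table keyed by language name. The sketch below is only an illustration: the `keyword_sets_sketch` table, its deliberately incomplete word lists and the `is_keyword_for` helper are assumptions and do not appear in c2html.php.

```php
<?php
// Hypothetical table (not part of c2html.php): a few keywords per language.
// The lists are deliberately incomplete and only illustrate the idea.
function keyword_sets_sketch(): array
{
    // Control-flow words shared by most C-style languages
    $shared = ['break', 'case', 'continue', 'default', 'do', 'else',
               'for', 'if', 'return', 'switch', 'while'];

    return [
        'c'          => array_merge($shared, ['goto', 'sizeof']),
        'javascript' => array_merge($shared, ['function', 'var', 'let', 'typeof']),
        'glsl'       => array_merge($shared, ['discard', 'uniform', 'in', 'out']),
    ];
}

// Hypothetical helper: is $word a keyword in the given language's list?
function is_keyword_for(string $language, string $word): bool
{
    $sets = keyword_sets_sketch();
    return isset($sets[$language]) && in_array($word, $sets[$language], true);
}
```

With a table like this, the same parser could be reused and only the word lists swapped per language.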

Writing a code parser is similar to writing a compiler - you have to convert the text into tokens - then go token by token to analyse what that token means - such as, is it a string, is it a comment and so on.
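
As a rough illustration of that tokenizing step, the sketch below splits C-style source into typed tokens using a single regular expression. The function name `tokenize_sketch` and the token type labels are hypothetical; c2html.php itself scans the text character by character rather than using a regular expression.

```php
<?php
// Hypothetical helper (not part of c2html.php): split C-style source into
// [type, text] pairs. Earlier entries in $parts win over later ones.
function tokenize_sketch(string $src): array
{
    $parts = [
        'comment' => '//[^\n]*|/\*.*?\*/',                              // line or block comment
        'string'  => '"(?:\\\\.|[^"\\\\])*"|\'(?:\\\\.|[^\'\\\\])*\'',  // quoted literal with escapes
        'word'    => '[A-Za-z_][A-Za-z0-9_]*',                          // identifier or keyword
        'number'  => '[0-9][0-9A-Za-z.]*',                              // crude numeric literal
        'space'   => '\s+',
        'other'   => '.',                                               // any single leftover character
    ];

    // Build one pattern made of named alternatives: (?<comment>...)|(?<string>...)|...
    $alternatives = [];
    foreach ($parts as $name => $regex) {
        $alternatives[] = "(?<$name>$regex)";
    }
    $pattern = '~' . implode('|', $alternatives) . '~s';

    preg_match_all($pattern, $src, $matches, PREG_SET_ORDER);

    $tokens = [];
    foreach ($matches as $m) {
        // Exactly one named group is non-empty for each match
        foreach (array_keys($parts) as $name) {
            if (isset($m[$name]) && $m[$name] !== '') {
                $tokens[] = [$name, $m[$name]];
                break;
            }
        }
    }
    return $tokens;
}
```

Each token can then be inspected in turn, for example by looking a `word` up in a keyword list or by wrapping a `comment` in a color tag.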

The following provides a complete PHP implementation for converting C files (and other C-style code) to colored vanilla HTML.

In fact, the actual code shown below was generated using the PHP implementation (c2html.php) - so you can see how it looks when generating 'PHP' type syntax as well.

C2HTML PHP Code


<?php
/********************************************************************************/
/*                                                                              */
/*  File:   c2html.php                                                          */
/*  Auth:   Ben Kenwright                                                       */
/*  Email:  bkenwright@xbdev.net                                                */
/*  Url:    www.xbdev.net                                                       */
/*  Date:   01/01/06                                                            */
/*                                                                              */
/********************************************************************************/
/*
    About?
    Well it's such a simple program, and I'm sure everyone has written one, one time
    or another.
    Convert our code, c/c++ so that it's color coded!  This code takes it a bit
    further and generates a html file, so you can take your c/c++ text file
    and output a html file which will have all the great colour coding!

    I've kept it simple, as you can mix the html so it uses styles and .css files
    but I like to use the font color html tag....but it's very easy to convert
    to the alternative method if you prefer.

    To use?
    // load file (get code)
    $content = file_get_contents($filename);
    // convert it to html using function (echoes the complete page, returns nothing)
    convert2html($content);


    Further work / Thinking:
    o With a bit of modification, build a website for c/c++ color coding.
*/
/********************************************************************************/

define("ZCOMMENT", "<font color=\"navy\">");
define("ZSTRING", "<font color=\"green\">");
define("ZMACRO", "<font color=\"violet\">");
define("ZKEYWORDS", "<font color=\"maroon\">");
define("ZDECLARATIONS", "<font color=\"maroon\">");
define("ZUNIQUE", "<font color=\"red\">");

//--------------------------------------------------------------------------------

function parse($content) {
    $output = '';
    $length = strlen($content);
    $pos = 0;
    
    while ($pos < $length) {
        $c = $content[$pos];
        $pos++;
        
        // Start of a comment
        if ($c == '/') {
            if ($pos >= $length) {
                // A lone '/' at the very end of the input: emit it before returning
                $output .= "/";
                return $output;
            }
            
            $next_char = $content[$pos];
            $pos++;
            
            if ($next_char == '/' || $next_char == '*') {
                $comment_type = $next_char;
                $output .= ZCOMMENT . "/";
                $output .= put_char($next_char);
                
                $prev_char = '';
                while ($pos < $length) {
                    $current_char = $content[$pos];
                    $pos++;
                    $output .= put_char($current_char);
                    
                    if ($comment_type == '/' && $current_char == "\n") {
                        break;
                    } elseif ($comment_type == '*' && $prev_char == '*' && $current_char == '/') {
                        break;
                    }
                    
                    $prev_char = $current_char;
                }
                
                $output .= "</font>";
            } else {
                $pos--;
                $output .= "/";
            }
        } 
        elseif ($c == '\'' || $c == '"') {
            // Quotation
            $quote = $c;
            $back_slash = false;
            $output .= ZSTRING;
            $output .= put_char($c);
            
            while ($pos < $length) {
                $current_char = $content[$pos];
                $pos++;
                $output .= put_char($current_char);
                
                if ($current_char == $quote && !$back_slash) {
                    break;
                }
                
                if ($current_char == '\\' && !$back_slash) {
                    $back_slash = true;
                } else {
                    $back_slash = false;
                }
            }
            
            $output .= "</font>";
        } 
        elseif ($c == '#') {
            // Start of a macro
            $output .= ZMACRO;
            $output .= put_char($c);
            
            // Skip whitespace
            while ($pos < $length) {
                $current_char = $content[$pos];
                if (ctype_space($current_char)) {
                    $pos++;
                    $output .= put_char($current_char);
                } else {
                    break;
                }
            }
            
            $buffer = '';
            while ($pos < $length) {
                $current_char = $content[$pos];
                if (ctype_alpha($current_char)) {
                    $buffer .= $current_char;
                    $pos++;
                } else {
                    break;
                }
            }
            
            if (is_macro($buffer)) {
                $output .= "$buffer</font>";
            } else {
                $output .= "</font>$buffer";
            }
            
            // The character after the directive is left for the main loop,
            // so a quote or comment that follows directly is still highlighted.
        } 
        else {
            if (ctype_lower($c) || ctype_alnum($c) || $c == '_') {
                $buffer = $c;
                
                while ($pos < $length) {
                    $current_char = $content[$pos];
                    if (ctype_lower($current_char) || ctype_alnum($current_char) || $current_char == '_') {
                        $buffer .= $current_char;
                        $pos++;
                    } else {
                        break;
                    }
                }
                
                $output .= is_token($buffer);
                
                // The character after the word is left for the main loop,
                // so a comment or string that follows directly is still highlighted.
            } else {
                $output .= put_char($c);
            }
        }
    }
    
    return $output;
}// End Parse(..)

//--------------------------------------------------------------------------------

function put_char($c) { 
    $result = '';
    switch ($c) {
        case '<':
            $result = "&lt;";
            break;
        case '>':
            $result = "&gt;";
            break;
        case '&':
            $result = "&amp;";
            break;
        case "\t":
            $result = "    ";
            break;
        default:
            $result = $c;
            break;
    }
    
    return $result;
}// End put_char(..)

//--------------------------------------------------------------------------------

function is_token($buffer) {
    $result = '';
    if (is_keyword($buffer)) {
        $result = ZKEYWORDS . $buffer . "</font>";
    } elseif (is_decl($buffer)) {
        $result = ZDECLARATIONS . $buffer . "</font>";
    } elseif (is_uniq($buffer) || is_number($buffer)) {
        $result = ZUNIQUE . $buffer . "</font>";
    } else {
        $result = $buffer;
    }
    
    return $result;
}// End is_token(..)

//--------------------------------------------------------------------------------

function is_keyword($buffer) {
    $keywords = array(
        "break", "case", "continue", "default", "do", "else", "for",
        "goto", "if", "return", "sizeof", "switch", "while"
    );
    
    return in_array($buffer, $keywords);
}// End is_keyword(..)

//--------------------------------------------------------------------------------

function is_decl($buffer) {
    $declarations = array(
        "auto", "char", "const", "DIR", "double", "enum", "extern",
        "FILE", "float", "fpos_t", "int", "int8_t", "int16_t",
        "int32_t", "int64_t", "long", "mode_t", "pid_t", "register",
        "short", "signed", "size_t", "ssize_t", "static", "struct",
        "typedef", "union", "unsigned", "va_list", "void", "volatile",
        "class", "public", "protected", "private"
    );
    
    return in_array($buffer, $declarations);
}// End is_decl(..)

//--------------------------------------------------------------------------------

function is_uniq($buffer) {
    $unique = array(
        "__DATE__", "__TIME__", "EACCES", "EAGAIN", "EBADF",
        "EBUSY", "EOF", "ECHILD", "EDEADLK", "EDOM",
        "EFAULT", "EINVAL", "EILSEQ", "EINTR", "EFBIG",
        "EISDIR", "stdin", "EMFILE", "EMLINK", "EMSGSIZE",
        "ENFILE", "ENODEV", "ENOENT", "ENOLCK", "stdout",
        "ENOMEM", "ENOTDIR", "ENOSPC", "ENOSYS", "ENOTEMPTY",
        "ENOTSUP", "ENOTTY", "ENOEXEC", "ENXIO", "ECANCELED",
        "EPIPE", "ERANGE", "EROFS", "ESPIPE", "ESRCH",
        "EXDEV", "__FILE__", "__LINE__", "NULL", "SEEK_SET",
        "SEEK_CUR", "SEEK_END", "SIGABRT", "SIGALRM", "SIGCHLD",
        "SIGCONT", "SIG_DFL", "SIG_ERR", "SIGHUP", "SIG_IGN",
        "SIGINT", "SIGFPE", "SIGKILL", "SIGQUIT", "SIGSEGV",
        "SIGTSTP", "SIGTERM", "SIGTRAP", "SIGTTIN", "SIGTTOU",
        "SIGUSR1", "SIGUSR2", "__STDC__", "stderr", "EINPROGRESS",
        "E2BIG", "EBADMSG", "EEXIST", "EIO", "ENAMETOOLONG",
        "SIGILL", "EPERM", "SIGSTOP", "ETIMEDOUT",
    );
    
    return in_array($buffer, $unique);
}// End is_uniq(..)

//--------------------------------------------------------------------------------

function is_number($buffer) {
    if (!preg_match('/^[0-9a-fA-FxX]+$/', $buffer)) {
        return false;
    }
    
    if (strlen($buffer) > 1 && $buffer[0] == '0' && ($buffer[1] == 'x' || $buffer[1] == 'X')) {
        // Hex number
        return ctype_xdigit(substr($buffer, 2));
    }
    
    return ctype_digit($buffer);
}// End is_number(..)

//--------------------------------------------------------------------------------

function is_macro($buffer) {
    $macros = array(
        "define", "elif", "else", "endif", "error", "if", 
        "ifdef", "ifndef", "include", "line", "pragma"
    );
    
    return in_array($buffer, $macros);
}// End is_macro(..)

//--------------------------------------------------------------------------------

function page_head() {
    echo "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\">\n";
    echo "<html>\n";
    echo "<head>\n";
    echo "<title>www.xbdev.net c2html demo</title>\n";
    echo "</head>\n";
    echo "<body>\n\n";
}// End page_head(..)

//--------------------------------------------------------------------------------

function page_foot() {
    echo "\n\n</body>\n";
    echo "</html>\n";
}// End page_foot(..)

//--------------------------------------------------------------------------------

function file_start() {
    echo "<pre>\n";
}// End file_start()

//--------------------------------------------------------------------------------

function file_end() {
    echo "</pre>\n";
    echo "<hr />\n";
}// End file_end()

/********************************************************************************/
/*                                                                              */
/*  Program Entry Point                                                         */
/*                                                                              */
/********************************************************************************/

function convert2html($content)
{
    page_head();
    file_start();
    $cc = parse($content);
    echo( $cc );
    file_end();
    page_foot();
}// End convert2html(..)

//--------------------------------------------------------------------------------
?>
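
Because `convert2html()` echoes the page rather than returning it, saving the result to a file needs the output to be captured first. One possible approach is PHP's output buffering, sketched below. The helper name `capture_output` and the file names in the usage comment are hypothetical and are not part of c2html.php.

```php
<?php
// Hypothetical helper: run a callable that echoes its result and
// return everything it printed as a string.
function capture_output(callable $render): string
{
    ob_start();                              // start buffering echoed output
    try {
        $render();
    } finally {
        $captured = (string) ob_get_clean(); // stop buffering, fetch the text
    }
    return $captured;
}

// Possible usage with the listing above (assumes c2html.php has been
// included; 'example.c' and 'example.html' are placeholder names):
//
//   $content = file_get_contents('example.c');
//   $html = capture_output(function () use ($content) {
//       convert2html($content);
//   });
//   file_put_contents('example.html', $html);
```

An alternative would be to change `convert2html()` so that it builds and returns a string, which avoids buffering altogether.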


Things to Try


A few ideas for you to explore if you're interested in taking this implementation further:

• Additional styles and colors
• Options to select the 'style' (different color options)
• Use 'styles' instead of hard coded elements (easier to update and maintain) - also the future direction of the web.
• Add tooltip or hover over options - hover over blocks of code - you can add a 'popup' tooltip?
• Add line numbers down the side
• Explore different fonts/layouts to make the code more readable
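
Two of the ideas above, CSS classes instead of hard coded font tags and line numbers down the side, could be started along the following lines. This is a minimal sketch: the function names `span_wrap` and `add_line_numbers` and the class names `keyword` and `lineno` are hypothetical and do not appear in c2html.php.

```php
<?php
// Hypothetical replacement for the ZKEYWORDS-style font tags: wrap an
// already HTML-escaped token in a <span> carrying a CSS class name.
function span_wrap(string $class, string $escapedText): string
{
    return '<span class="' . htmlspecialchars($class, ENT_QUOTES) . '">'
         . $escapedText . '</span>';
}

// Hypothetical helper: prefix each line of highlighted HTML (the text that
// goes inside <pre>) with a right-aligned line number.
function add_line_numbers(string $html): string
{
    $lines = explode("\n", $html);
    $width = strlen((string) count($lines)); // digits needed for the last number

    foreach ($lines as $i => $line) {
        $number = str_pad((string) ($i + 1), $width, ' ', STR_PAD_LEFT);
        $lines[$i] = '<span class="lineno">' . $number . '</span> ' . $line;
    }
    return implode("\n", $lines);
}
```

The colors then live in a stylesheet, for example `.keyword { color: maroon; }`. A multi-line comment keeps its color tag open across lines, so the `lineno` class would need its own explicit color to avoid inheriting the comment color.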


Resources & Links


• Based on C2HTML Version in C++ [LINK]

• Live Example of the Code [LINK]







 
Copyright (c) 2002-2025 xbdev.net - All rights reserved.
Designated articles, tutorials and software are the property of their respective owners.