
EDR: How to Utilize Tokenization for Cmdline Searches



  • EDR Console: All Supported Versions


How to utilize command line tokenization to help create better queries that are easier to write and understand


Command lines are tokenized with the default cbeventsv2 schema in Solr. Tokenization is a way of breaking up the command into smaller chunks that can be searched individually.

For example, let's use the following command line to see how it is tokenized and how the resulting tokens can be searched:

"C:\Windows\Microsoft.NET\Framework64\v4.0.30319\csc.exe" /noconfig
Tokenization breaks this up into smaller pieces based on the last dot, backslashes, spaces, parentheses, and other special characters. This command line would be broken up like the following

Instead of searching by the full command line like this:
cmdline:"\"C:\\Windows\\Microsoft.NET\\Framework64\\v4.0.30319\\csc.exe\" /noconfig"
We can simplify it to a few specifics using tokenization. Here is an example:
In our example command line, we know that the version will change. We also know that the Framework64 path could just be Framework. Wildcards would not work here; no results would come back. So how can we search for this without a wildcard? We can split the command line search into separate terms joined with ANDs. Notice the two possible Framework* paths within parentheses, joined with an OR; this takes the place of a wildcard.
((cmdline:"C:\\Windows\\Microsoft.NET\\Framework64" OR cmdline:"C:\\Windows\\Microsoft.NET\\Framework") AND cmdline:"csc.exe" AND cmdline:"/noconfig")
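When there are several alternatives or many required tokens, the query above can be assembled programmatically. The following is a minimal sketch, not part of the product or of CbAPI: the helper names quote_term and build_cmdline_query are hypothetical, and the escaping rule (double each backslash, escape double quotes) is an assumption inferred from the example queries in this article.

```python
def quote_term(value: str) -> str:
    """Wrap one token or path in a cmdline:"..." term.

    Assumption: backslashes are doubled and double quotes are escaped,
    as in the example queries shown in this article.
    """
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'cmdline:"{escaped}"'


def build_cmdline_query(alternatives, required):
    """OR the alternatives together, then AND the group with each required term.

    Both lists are expected to be non-empty in this sketch.
    """
    any_of = " OR ".join(quote_term(a) for a in alternatives)
    all_of = " AND ".join(quote_term(r) for r in required)
    return f"(({any_of}) AND {all_of})"


query = build_cmdline_query(
    alternatives=[
        r"C:\Windows\Microsoft.NET\Framework64",
        r"C:\Windows\Microsoft.NET\Framework",
    ],
    required=["csc.exe", "/noconfig"],
)
print(query)
```

The printed string is the same query as the one above and can be pasted into the console search box. Listing each alternative explicitly is what stands in for the wildcard.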
  • The command line I have is very complicated. How can I better see how it is tokenized? You have two options:
    1. Solr Dashboard (On-prem only)
      1. To access the Solr dashboard, you need an endpoint that has been granted firewall access to port 8080 so the page can be opened in a web browser. Work with your admin to provide that access. It is recommended to disable the access when you are done; port 8080 should not be accessible outside localhost/minions during normal operation.
      2. Type http://<fqdn or ip of server>:8080/solr/#/reader/analysis
      3. In the "Field Value" box, type or paste your command line. Select cmdline in the "Analyse Fieldname" drop-down.
      4. Uncheck "verbose output" and click "Analyse Values".
      5. The individual tokens of your command line appear between the pipe characters.
    2. CbAPI. This requires you to set up CbAPI on a machine with Python (see Getting Started with CbAPI).
      1. Use the CbAPI Python script in the Additional Notes section.
      2. This script lets you tokenize in two ways: by the unique ID of the process found on the Process Analysis page, or by manually entering the cmdline.
        • By Unique ID (use the -i switch with the unique id). This automatically pulls the cmdline associated with the document and uses the correct OS format.
          python.exe -i 00000018-0000-0a38-01d8-bd46e27cea2f-018318fee86a
          Tokenizing Command Line:
           %SystemRoot%\system32\csrss.exe ObjectDirectory=\Windows SharedSection=1024,20480,768 Windows=On SubSystemType=Windows ServerDll=basesrv,1 ServerDll=winsrv:UserServerDllInitialization,3 ServerDll=sxssrv,4 ProfileControl=Off MaxRequestThreads=16
          Tokenized Command Line:
           ['%SystemRoot%', 'system32', 'csrss.exe', '.exe', 'ObjectDirectory', 'Windows', 'SharedSection', '1024', '20480', '768', 'Windows', 'On', 'SubSystemType', 'Windows', 'ServerDll', 'basesrv', '1', 'ServerDll', 'winsrv', 'UserServerDllInitialization', '3', 'ServerDll', 'sxssrv', '4', 'ProfileControl', 'Off', 'MaxRequestThreads', '16']

        • By manual command line (use -c to supply the command line manually; if the cmdline is from macOS or Linux, also include the -u switch, which is how the script below defines the non-Windows option)
          python.exe -c "\"C:\\Program Files (x86)\\Google\\Update\\Install\\{023FE676-E24A-4EA1-A3F5-C2844B126DEF}\\CR_8DF3F.tmp\\setup.exe\" --install-archive=\"C:\\Program Files (x86)\\Google\\Update\\Install\\{023FE676-E24A-4EA1-A3F5-C2844B126DEF}\\CR_8DF3F.tmp\\CHROME_PATCH.PACKED.7Z\" --previous-version=\"104.0.5112.81\" --verbose-logging --do-not-launch-chrome --channel=stable --system-level"
          Tokenizing Command Line:
           "C:\\Program Files (x86)\\Google\\Update\\Install\\{023FE676-E24A-4EA1-A3F5-C2844B126DEF}\\CR_8DF3F.tmp\\setup.exe" --install-archive="C:\\Program Files (x86)\\Google\\Update\\Install\\{023FE676-E24A-4EA1-A3F5-C2844B126DEF}\\CR_8DF3F.tmp\\CHROME_PATCH.PACKED.7Z" --previous-version="104.0.5112.81" --verbose-logging --do-not-launch-chrome --channel=stable --system-level
          Tokenized Command Line:
           ['C:', 'Program', 'Files', 'x86', 'Google', 'Update', 'Install', '023FE676-E24A-4EA1-A3F5-C2844B126DEF', 'CR_8DF3F.tmp', '.tmp', 'setup.exe', '.exe', '--install-archive', 'C:', 'Program', 'Files', 'x86', 'Google', 'Update', 'Install', '023FE676-E24A-4EA1-A3F5-C2844B126DEF', 'CR_8DF3F.tmp', '.tmp', 'CHROME_PATCH.PACKED.7Z', '.7Z', '--previous-version:', '104.0.5112.81', '.81', '--verbose-logging', '--do-not-launch-chrome', '--channel:', 'stable', '--system-level']
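
For the Solr Dashboard option, the analysis page is backed by Solr's field analysis request handler (/analysis/field), so the same lookup can in principle be expressed as a URL. The sketch below only builds that URL and makes no network call. The core name "reader" and port 8080 are taken from the dashboard URL in this article; whether your server answers direct requests to this handler is an assumption, and the function name build_analysis_url is hypothetical.

```python
from urllib.parse import urlencode


def build_analysis_url(server, cmdline, core="reader", port=8080):
    """Build a Solr field-analysis URL for the cmdline field.

    Assumptions: the core is named "reader" and Solr listens on port 8080,
    as in the dashboard URL shown in this article.
    """
    params = urlencode({
        "analysis.fieldname": "cmdline",   # field whose analyzer is applied
        "analysis.fieldvalue": cmdline,    # text to tokenize
        "wt": "json",                      # ask for a JSON response
    })
    return f"http://{server}:{port}/solr/{core}/analysis/field?{params}"


url = build_analysis_url(
    "localhost",
    r'"C:\Windows\Microsoft.NET\Framework64\v4.0.30319\csc.exe" /noconfig',
)
print(url)
```

The same firewall guidance applies: port 8080 should only be reachable for as long as you need it.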

Additional Notes script
  • Note: This script cannot fully represent how Solr tokenizes in some special circumstances; it is a best effort to replicate the behavior and help you create a query.
    import re
    import sys

    from cbapi.response.models import Process
    from cbapi.example_helpers import build_cli_parser, get_cb_response_object

    # This API script is used to help understand the tokenization of a cmdline
    # if you do not have access to the Solr dashboard.


    def get_cmdline(cb, unique_id):
        '''This is the API request to get the commandline of a process document by id.
        Grab the URL from the process analysis page in the console.
        Take the unique id, in this example "00000014-0000-02d0-01d8-bbd321ab8fbe", when using the -i switch.
        '''
        proc = cb.select(Process).where("process_id:{}".format(unique_id)).first()
        if proc is None:
            print(f'No process found for unique id {unique_id}')
            return 1
        return tokenize_cmd(proc.cmdline, proc.os_type)


    def tokenize_cmd(cmdline, os_type='windows'):
        space_list = []
        dot_list = []
        colon_list = []
        end_list = []
        print(f'\nTokenizing Command Line:\n {cmdline}\n')

        # Step 1: split on the path separator.
        if 'windows' in os_type.lower():
            # API converts to single backslash, we need to do the same for manual input
            cmdline_rr = cmdline.replace('\\\\', '\\')
            # Split on a literal backslash (not os.sep) so the result does not
            # depend on the OS the script itself runs on.
            split_cmdline = cmdline_rr.split('\\')
        else:
            split_cmdline = cmdline.split('/')

        # Step 2: split each piece on whitespace.
        for x in split_cmdline:
            if ' ' in x:
                space_list.extend(x.split())
            else:
                space_list.append(x)

        # Step 3: keep the full token and also emit the text after its last dot.
        for x in space_list:
            dot_list.append(x)
            if '.' in x:
                dot_list.append('.' + x.rsplit('.', 1)[-1])

        # Step 4: split on '=', ',', ':' and '/'.
        # keep_colon is never reset once set, as in the original script; the
        # sample output above depends on this ('--previous-version:', '--channel:').
        keep_colon = False
        for x in dot_list:
            if (':' in x and 'C:' not in x) or '=' in x or ',' in x:
                if x.endswith(':'):
                    keep_colon = True
                y = re.split(r'=|,|:|/', x)
                if keep_colon:
                    y[-2] = y[-2] + ':'
                colon_list.extend(z for z in y if z != '')
            else:
                colon_list.append(x)

        # Step 5: strip the remaining special characters and skip empty tokens.
        for x in colon_list:
            s = ''.join(c for c in x if c not in '#&|\\[]{}();,<>=\'\"')
            if s:
                end_list.append(s)
        print(f'\nTokenized Command Line:\n {end_list}\n')


    def main():
        # You have the option of having the API pull the cmdline for you based on
        # the unique id; see the note above for getting the unique id:
        #   -i 00000014-0000-02d0-01d8-bbd321ab8fbe
        # The -i switch picks up the OS type automatically.
        # If you have a cmdline, use -c. For non-Windows cmdlines, include the -u switch.
        # examples:
        #   -c "/usr/lib/firefox/firefox" -u
        #   -c "\"C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe\""
        #   -c "\"C:\Program Files\Google\Chrome\Application\chrome.exe\""
        parser = build_cli_parser(description="Commandline Tokenization")
        parser.add_argument("--id", "-i", dest="id", help="Unique ID of Process")
        parser.add_argument("--cmd", "-c", dest="cmd", help="Manual cmdline")
        parser.add_argument("--unix", "-u", dest="os_type", help="OS is not Windows", action='store_true')
        args = parser.parse_args()
        if args.id:
            # Only the lookup by unique id needs a connection to the server.
            cb = get_cb_response_object(args)
            return get_cmdline(cb, args.id)
        if args.cmd and args.os_type:
            return tokenize_cmd(args.cmd, 'unix')
        if args.cmd:
            return tokenize_cmd(args.cmd)
        parser.print_help()
        return 1


    if __name__ == "__main__":
        sys.exit(main())
  • Where can the Unique ID be found? 
    • Select the process in question; this opens the Process Analysis page.
    • The URL will look something like this: 
    • Take the unique id (in this example, "00000014-0000-02d0-01d8-bbd321ab8fbe") and pass it to the -i switch.
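
If you copy the whole URL from the browser, the unique ID can be pulled out with a regular expression instead of by hand. This is a minimal sketch: the pattern is inferred from the two sample IDs in this article (a GUID, optionally followed by a 12-hex-digit suffix), the function name extract_unique_id is hypothetical, and the URL in the example is made up for illustration; only the ID inside it comes from the article.

```python
import re

# Pattern inferred from the sample IDs in this article:
# 8-4-4-4-12 hex digits, optionally followed by "-" and 12 more hex digits.
UNIQUE_ID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
    r"(?:-[0-9a-f]{12})?",
    re.IGNORECASE,
)


def extract_unique_id(text):
    """Return the first unique ID found in text, or None if there is none."""
    match = UNIQUE_ID_RE.search(text)
    return match.group(0) if match else None


# Hypothetical URL shape; only the ID is taken from the article.
example_url = "https://edr.example.com/#/analyze/00000014-0000-02d0-01d8-bbd321ab8fbe/1"
print(extract_unique_id(example_url))
```

The returned value is what you would pass to the script's -i switch.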

