+ Added/refactored new custom GPTs scripts

- added gen_gpt_templ script
- improved Custom GPTs template generator
Elias Bachaalany
2025-07-30 05:39:03 -07:00
parent 612584a66c
commit 58e8bd1e72
9 changed files with 900 additions and 110 deletions

Tools/README.md Normal file

@@ -0,0 +1,13 @@
# TheBigPromptLibrary Tools
This directory contains various tools and utilities for working with ChatGPT's custom GPTs.
## Available Tools
- A [collection of scripts](./openai_gpts/README.md) for managing and working with ChatGPT Custom GPTs
## License
These tools are open-sourced under the GNU General Public License (GPL). Under this license, you are free to use, modify, and redistribute this software, provided that all copies and derivative works are also licensed under the GPL.
For more details, see the [GPLv3 License](https://www.gnu.org/licenses/gpl-3.0.html).

Tools/openai_gpts/README.md Normal file

@@ -0,0 +1,177 @@
# Custom GPTs scripts and tools
This directory contains utilities for working with ChatGPT Custom GPTs in TheBigPromptLibrary:
- **idxtool.py** - GPT indexing and searching tool
- **gen_gpt_templ.py** - Generate markdown templates for ChatGPT GPTs by downloading and parsing their metadata
- **oneoff.py** - One-off operations on GPT files (e.g., batch reformatting)
## idxtool
The `idxtool` script is a Custom GPT indexing and searching tool used in TheBigPromptLibrary.
### Command line
```
usage: idxtool.py [-h] [--toc [TOC]] [--find-gpt FIND_GPT]
[--template TEMPLATE] [--parse-gptfile PARSE_GPTFILE]
[--rename]
idxtool: A GPT indexing and searching tool for the CSP repo
options:
-h, --help show this help message and exit
--toc [TOC] Rebuild the table of contents of GPT custom instructions
--find-gpt FIND_GPT Find a GPT file by its ID or full ChatGPT URL
--template TEMPLATE Creates an empty GPT template file from a ChatGPT URL
--parse-gptfile PARSE_GPTFILE
Parses a GPT file name
--rename Rename the GPT file names to include their GPT ID
```
### Features
- Rebuild TOC: Use `--toc` to rebuild the table of contents for GPT custom instructions.
- Find GPT file: Use `--find-gpt [GPT ID | full ChatGPT URL | @response_file with IDs/URLs]` to find a GPT by its ID or URL.
- Rename GPT files: Use `--rename` to rename all the GPT files to include their GPT ID as a prefix.
- Create a starter template GPT file: Use `--template [Full ChatGPT URL]` to create a starter template GPT file.
- Help: Use `--help` to display the help message and usage instructions.
### Example
To rebuild the table of contents for the custom GPT files, run:
```bash
python idxtool.py --toc
```
To find a GPT by its ID, run:
```bash
python idxtool.py --find-gpt 3rtbLUIUO
```
or by URL:
```bash
python idxtool.py --find-gpt https://chat.openai.com/g/g-svehnI9xP-retro-adventures
```
Additionally, you can have a file with a list of IDs or URLs and pass it to the `--find-gpt` option:
```bash
python idxtool.py --find-gpt @gptids.txt
```
(note the '@' symbol).
The `gptids.txt` file contains a list of IDs or URLs, one per line:
```text
3rtbLUIUO
https://chat.openai.com/g/g-svehnI9xP-retro-adventures
#vYzt7bvAm
w2yOasK1r
waDWNw2J3
```
## gen_gpt_templ
The `gen_gpt_templ` script generates markdown templates for ChatGPT GPTs by downloading and parsing their metadata from the ChatGPT website.
### Command line
```bash
usage: gen_gpt_templ.py [-h] [--debug] [--dump] [input]
Generate markdown template for ChatGPT GPTs
positional arguments:
input GPT URL, GPT ID, g-prefixed GPT ID, or @response_file
options:
-h, --help show this help message and exit
--debug Save debug files (HTML and dump)
--dump Save parsed names and values to .txt file
```
### Features
- Downloads GPT metadata from ChatGPT URLs
- Parses GPT information including title, description, author, and profile picture
- Generates markdown templates with GPT metadata
- Supports multiple input formats:
- Full ChatGPT URL: `https://chatgpt.com/g/g-VgbIr9TQQ-ida-pro-c-sdk-and-decompiler`
- Conversation URL: `https://chatgpt.com/g/g-m5lMeGifF-sql-expert-querygpt/c/682cd38c-ca8c-800d-b6e2-33b8ba763824`
- GPT ID: `VgbIr9TQQ`
- Prefixed GPT ID: `g-VgbIr9TQQ`
- Response file: `@gptids.txt` (processes multiple GPTs from a file)
### Examples
Generate template for a single GPT:
```bash
python gen_gpt_templ.py https://chatgpt.com/g/g-VgbIr9TQQ-ida-pro-c-sdk-and-decompiler
```
Process multiple GPTs from a file:
```bash
python gen_gpt_templ.py @gptids.txt
```
Generate template with debug output:
```bash
python gen_gpt_templ.py g-VgbIr9TQQ --debug --dump
```
## Differences between idxtool and gen_gpt_templ
### idxtool --template
- Uses gen_gpt_templ internally to download actual GPT metadata from ChatGPT
- Creates templates with real GPT information (title, description, author, logo)
- Generates properly named files (`{gpt_id}.md`) without RENAMEME suffix
- Simpler interface for basic template generation within the idxtool workflow
### gen_gpt_templ
- Full-featured standalone tool with additional capabilities:
- `--dump` flag to save all parsed metadata to .txt file
- `--debug` flag to save HTML and debug information
- Batch processing with @response_file for multiple GPTs
- More detailed console output showing download and parsing progress
- Can be used as a module by other tools (like idxtool)
Use `idxtool --template` when you need a quick template as part of your GPT file management workflow. Use `gen_gpt_templ` directly when you need the advanced features like metadata dumping or batch processing.
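As a rough sketch of the module-style usage (assuming the script is run from the `Tools/openai_gpts` directory so `gen_gpt_templ` is importable), the same `process_gpt_input`/`generate_template` calls that `idxtool --template` relies on can be used directly:

```python
# Sketch: using gen_gpt_templ as a module, the way idxtool's make_template does.
# Assumes gen_gpt_templ.py is on the import path (e.g., the current directory).
import gen_gpt_templ

url, gpt_id = gen_gpt_templ.process_gpt_input("g-VgbIr9TQQ")
success, result = gen_gpt_templ.generate_template(url, debug=False, dump=False)
if success:
    # result is a GenerateTemplateResult(template, short_url, gpt_id, parser)
    with open(f"{result.gpt_id}.md", "w", encoding="utf-8") as f:
        f.write(result.template)
else:
    print(f"Error: {result}")
```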
## oneoff
The `oneoff` script performs one-off operations on GPT files, primarily batch processing tasks.
### Features
- **Reformat GPT files**: Reformats all GPT markdown files in a source directory and saves them to a destination directory
- Validates GPT file structure during processing
- Preserves GPT metadata (ID, name) during reformatting
### Usage
The script is designed for batch operations. Currently supports:
1. **Batch reformatting**: Process all `.md` files in a source directory, reformat them according to the standard GPT markdown structure, and save to a destination directory.
Example usage in code:
```python
from oneoff import reformat_gpt_files
success, message = reformat_gpt_files("source_gpts/", "formatted_gpts/")
print(message)
```
## License
This tool is open-sourced under the GNU General Public License (GPL). Under this license, you are free to use, modify, and redistribute this software, provided that all copies and derivative works are also licensed under the GPL.
For more details, see the [GPLv3 License](https://www.gnu.org/licenses/gpl-3.0.html).

Tools/openai_gpts/gen_gpt_templ.py Normal file

@@ -0,0 +1,654 @@
"""
Generate markdown templates for ChatGPT GPTs by downloading and parsing their metadata.
By Elias Bachaalany
Usage:
    gen_gpt_templ.py <gpt_url|gpt_id|g-prefixed_id> [--debug] [--dump]
    gen_gpt_templ.py @response_file.txt [--debug] [--dump]
"""
import re
import json
import os
import sys
import argparse
import requests
from collections import namedtuple
# Named tuple for generate_template return value
GenerateTemplateResult = namedtuple('GenerateTemplateResult',
['template', 'short_url', 'gpt_id', 'parser'])
# Global template string
TEMPLATE = """GPT URL: https://chatgpt.com/g/{short_url}
GPT logo: <img src="{profile_pic}" width="100px" />
GPT Title: {title}
GPT Description: {description} - By {author_display_name}
GPT instructions:
```markdown
```"""
# ----------------------------------------------------------
def parse_gpt_id(url):
"""
Parse the GPT ID from a ChatGPT URL
Args:
url (str): Full ChatGPT URL like https://chatgpt.com/g/g-VgbIr9TQQ-ida-pro-c-sdk-and-decompiler
Returns:
str or None: The GPT ID (e.g., 'VgbIr9TQQ') or None if not found
"""
# Pattern to match g- followed by 9 characters
pattern = r'/g/g-([a-zA-Z0-9]{9})'
match = re.search(pattern, url)
if match:
return match.group(1)
return None
# ----------------------------------------------------------
# Compile regex to extract streamController.enqueue arguments
_ENQUEUE_RE = re.compile(
r'window\.__reactRouterContext\.streamController\.enqueue\(\s*' # find the call
r'(?P<q>["\'])' # capture whether it's " or '
r'(?P<raw>(?:\\.|(?!\1).)*?)' # any escaped-char or char not the opening quote
r'(?P=q)\s*' # matching closing quote
r'\)',
flags=re.DOTALL
)
# ----------------------------------------------------------
def extract_enqueue_args(html_text, decode_escapes=True):
"""
Scans html_text for all streamController.enqueue(...) calls,
returns a list of the raw string-literals inside the quotes.
"""
args = []
for m in _ENQUEUE_RE.finditer(html_text):
raw = m.group('raw')
if decode_escapes:
# Only decode actual escape sequences, not Unicode characters
# This prevents double-encoding of emojis and other Unicode chars
try:
# First try to parse as JSON string to handle escapes properly
raw = json.loads('"' + raw + '"')
            except json.JSONDecodeError:
# Fallback to simple replacement of common escapes
raw = raw.replace('\\n', '\n').replace('\\t', '\t').replace('\\"', '"').replace("\\'", "'").replace('\\\\', '\\')
args.append(raw)
return args
# ----------------------------------------------------------
class CustomGPTParser:
def __init__(self):
self._parse_cache = {} # Cache for parsed data
self._parsed_items = None # Store parsed items internally
def parse(self, source, debug: bool = False):
# Determine if source is a filename or content
# First check if it could be a file (avoid treating content as filename)
is_likely_filename = (
len(source) < 1000 and # Reasonable filename length
'|' not in source and # Filenames don't contain pipes
os.path.isfile(source)
)
if is_likely_filename:
try:
with open(source, encoding='utf-8') as f:
content = f.read()
except Exception as e:
return (False, f"Error reading file: {e}")
else:
# Treat as content
content = source
# Parse the content
        if not (enqueue_args := extract_enqueue_args(content)):
            msg = "No enqueue arguments found in the provided string."
            if debug:
                print(msg)
            return (False, msg)
try:
# Use the argument with the longest length (most likely the Gizmo data)
s = max(enqueue_args, key=len)
data = json.loads(s)
parsed_items = []
for item in data:
if isinstance(item, dict):
for k, v in item.items():
parsed_items.append((k, v))
else:
if debug:
print(f" {item} (type: {type(item).__name__})")
parsed_items.append(item)
self._parsed_items = parsed_items
return (True, None)
except json.JSONDecodeError as e:
return (False, f"JSON decoding error: {e}")
def get_title(self):
"""
Extract the title of the GPT by finding the item preceding 'description'.
The algorithm walks through items to find 'description', then returns
the immediately preceding item as the title.
Returns:
str: The title value or empty string on failure
"""
# Check cache first
if 'title' in self._parse_cache:
return self._parse_cache['title']
# Need parsed items to work with
if not self._parsed_items:
return ''
# Convert to list if not already to allow indexing
items_list = list(self._parsed_items)
# Find 'description' and get the preceding item
for i, item in enumerate(items_list):
# Skip tuples
if isinstance(item, tuple):
continue
# Found 'description'?
if item == 'description' and i > 0:
# Get the previous item as title
prev_item = items_list[i - 1]
# Make sure it's a string value, not a tuple
if isinstance(prev_item, str):
self._parse_cache['title'] = prev_item
return prev_item
# Not found
return ''
def get_author_display_name(self):
"""
Extract the author display name by finding the item after 'user-{id}'.
The pattern is:
- 'user_id' (literal string)
- 'user-{actual_user_id}' (e.g., 'user-IUwuaeXwGuwv0UoRPaeEqlzs')
- '{author_display_name}' (e.g., 'Elias Bachaalany')
Returns:
str: The author display name or empty string on failure
"""
# Check cache first
if 'author_display_name' in self._parse_cache:
return self._parse_cache['author_display_name']
# Need parsed items to work with
if not self._parsed_items:
return ''
# Convert to list if not already to allow indexing
items_list = list(self._parsed_items)
# Find pattern: 'user_id' -> 'user-{id}' -> '{display_name}'
for i, item in enumerate(items_list):
# Skip tuples
if isinstance(item, tuple):
continue
# Found 'user_id'?
if item == 'user_id' and i + 2 < len(items_list):
# Check if next item is a user ID (starts with 'user-')
next_item = items_list[i + 1]
if isinstance(next_item, str) and next_item.startswith('user-'):
# The item after that should be the display name
display_name_item = items_list[i + 2]
if isinstance(display_name_item, str):
self._parse_cache['author_display_name'] = display_name_item
return display_name_item
# Not found
return ''
def get_str_value(self, name: str, default: str = None):
"""
Get a string value by name from the parsed items.
Args:
name: The key/name to search for
default: Default value if not found
Returns:
str: The value associated with the name or default
"""
# Check cache first
if name in self._parse_cache:
return self._parse_cache[name]
# Need parsed items to work with
if not self._parsed_items:
return default
# Search through items
it = iter(self._parsed_items)
for item in it:
# Handle tuple items (key-value pairs from dictionaries)
if isinstance(item, tuple):
# We don't handle tuple items now
continue
# Handle flat list items (name followed by value)
if item == name:
try:
val = next(it)
# Cache and return the value
self._parse_cache[name] = str(val)
return str(val)
except StopIteration:
return default
return default
def clear_cache(self):
"""Clear the internal cache"""
self._parse_cache.clear()
def get_parsed_items(self):
"""Get the parsed items (for backward compatibility)"""
return self._parsed_items if self._parsed_items else []
def dump(self, safe_ascii=True):
"""
Dump all parsed items in a formatted way.
Args:
safe_ascii (bool): If True, encode non-ASCII characters safely
Returns:
None (prints to stdout)
"""
if not self._parsed_items:
print("No parsed items to dump")
return
print(f"Dumping {len(self._parsed_items)} parsed items:")
print("-" * 60)
for item in self._parsed_items:
if isinstance(item, tuple) and len(item) == 2:
# Handle key-value pairs from dictionaries
k, v = item
if safe_ascii:
# Handle Unicode characters safely by encoding to ASCII with replacement
k_safe = str(k).encode('ascii', errors='replace').decode('ascii')
v_safe = str(v).encode('ascii', errors='replace').decode('ascii')
print(f" {k_safe}: {v_safe} (type: {type(v).__name__})")
else:
print(f" {k}: {v} (type: {type(v).__name__})")
else:
# Handle non-tuple items
if safe_ascii:
# Handle Unicode characters safely for non-dict items
item_safe = str(item).encode('ascii', errors='replace').decode('ascii')
print(f" {item_safe} (type: {type(item).__name__})")
else:
print(f" {item} (type: {type(item).__name__})")
print("-" * 60)
# ----------------------------------------------------------
def download_page(url: str, out_filename: str = '') -> tuple[bool, object]:
"""
Download a page using browser-like headers
Args:
url (str): The full URL to download
out_filename (str): Optional filename to save to. If empty, no file is written.
Returns:
tuple[bool, object]: (success, content/error_message)
- (True, content) if successful
- (False, error_message) if failed
"""
# Ensure we have a full URL
if not url.startswith('http'):
return (False, "Please provide a full URL starting with http:// or https://")
# Base headers from the sample request
headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:139.0) Gecko/20100101 Firefox/139.0',
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Language': 'en-US,en;q=0.5',
# Remove Accept-Encoding to get uncompressed response
# 'Accept-Encoding': 'gzip, deflate, br, zstd',
'DNT': '1',
'Sec-GPC': '1',
'Connection': 'keep-alive',
'Upgrade-Insecure-Requests': '1',
'Sec-Fetch-Dest': 'document',
'Sec-Fetch-Mode': 'navigate',
'Sec-Fetch-Site': 'none',
'Sec-Fetch-User': '?1',
'Priority': 'u=0, i',
'TE': 'trailers'
}
try:
# Create a session to handle cookies and connections properly
session = requests.Session()
session.headers.update(headers)
# Make the GET request
response = session.get(url, timeout=30)
# Check if request was successful
response.raise_for_status()
# Save to file if filename provided
if out_filename:
with open(out_filename, 'w', encoding='utf-8') as f:
f.write(response.text)
return (True, response.text)
except requests.exceptions.RequestException as e:
return (False, f"Error downloading page: {e}")
except Exception as e:
return (False, f"Unexpected error: {e}")
# ----------------------------------------------------------
def process_gpt_input(input_str):
"""
Process GPT input which can be:
- Full URL: https://chatgpt.com/g/g-VgbIr9TQQ-ida-pro-c-sdk-and-decompiler
- Conversation URL: https://chatgpt.com/g/g-m5lMeGifF-sql-expert-querygpt/c/682cd38c-ca8c-800d-b6e2-33b8ba763824
- GPT ID: VgbIr9TQQ
- Prefixed GPT ID: g-VgbIr9TQQ
Returns:
tuple: (full_url, gpt_id)
"""
# Check if it's a full URL
if input_str.startswith('https://') or input_str.startswith('http://'):
gpt_id = parse_gpt_id(input_str)
if not gpt_id:
raise ValueError(f"Could not parse GPT ID from URL: {input_str}")
# If it's a conversation URL (contains /c/), extract the base GPT URL
if '/c/' in input_str:
# Extract the GPT part before /c/
base_url = input_str.split('/c/')[0]
return (base_url, gpt_id)
return (input_str, gpt_id)
# Check if it's a prefixed GPT ID (g-XXXXXXXXX)
if input_str.startswith('g-') and len(input_str) >= 11:
# Extract just the 9-character ID after 'g-'
gpt_id = input_str[2:11] # Get exactly 9 characters after 'g-'
url = f"https://chatgpt.com/g/{input_str}"
return (url, gpt_id)
# Assume it's a bare GPT ID (9 characters)
if len(input_str) == 9:
url = f"https://chatgpt.com/g/g-{input_str}"
return (url, input_str)
raise ValueError(f"Invalid GPT input format: {input_str}")
def generate_template(url, debug=False, dump=False):
"""
Download and parse GPT data, then generate markdown template
Args:
url: Full GPT URL
debug: Whether to save debug files (HTML and dump)
dump: Whether to print parsed items to console
Returns:
tuple: (success, result_or_error)
- (True, GenerateTemplateResult) if successful
- (False, error_message) if failed
"""
print(f"[DOWNLOAD] Fetching page from: {url}")
# Download the page
save_file = None
if debug:
save_file = "debug_download.html"
print(f"[DEBUG] Will save HTML to: {save_file}")
success, content = download_page(url, save_file)
if not success:
return (False, f"Download failed: {content}")
print(f"[DOWNLOAD] Successfully downloaded {len(content)} bytes")
# Parse the content
print(f"[PARSE] Parsing GPT data...")
parser = CustomGPTParser()
success, error = parser.parse(content)
if not success:
return (False, f"Parsing failed: {error}")
print(f"[PARSE] Successfully parsed {len(parser.get_parsed_items())} items")
# Save dump if debug mode
if debug:
from io import StringIO
old_stdout = sys.stdout
sys.stdout = buffer = StringIO()
parser.dump(safe_ascii=True)
dump_content = buffer.getvalue()
sys.stdout = old_stdout
dump_file = "debug_dump.txt"
with open(dump_file, 'w', encoding='utf-8') as f:
f.write(dump_content)
print(f"[DEBUG] Saved parsed data dump to: {dump_file}")
# Extract required fields
print(f"[EXTRACT] Extracting GPT metadata...")
short_url = parser.get_str_value('short_url', 'UNKNOWN')
profile_pic = parser.get_str_value('profile_picture_url', '')
title = parser.get_title()
description = parser.get_str_value('description', '')
author_display_name = parser.get_author_display_name()
print(f"[EXTRACT] Found:")
print(f" - Short URL: {short_url}")
print(f" - Title: {title}")
print(f" - Author: {author_display_name}")
try:
print(f" - Description: {description[:50]}..." if len(description) > 50 else f" - Description: {description}")
except UnicodeEncodeError:
# Handle special characters that can't be printed to console
safe_desc = description.encode('ascii', errors='replace').decode('ascii')
print(f" - Description: {safe_desc[:50]}..." if len(safe_desc) > 50 else f" - Description: {safe_desc}")
print(f" - Profile Pic: {'Yes' if profile_pic else 'No'}")
# Dump parsed items if requested
if dump:
print("\n[DUMP] Parsed items:")
parser.dump(safe_ascii=False)
# Generate template
template = TEMPLATE.format(
short_url=short_url,
profile_pic=profile_pic,
title=title,
description=description,
author_display_name=author_display_name
)
# Extract GPT ID from short_url (remove 'g-' prefix if it exists)
gpt_id = short_url[2:] if short_url.startswith('g-') else short_url
return (True, GenerateTemplateResult(template, short_url, gpt_id, parser))
def process_response_file(filename, debug=False, dump=False):
"""
Process a response file containing multiple GPT URLs/IDs
Args:
filename: Path to the response file
debug: Whether to save debug files
dump: Whether to dump parsed items
Returns:
tuple: (success_count, error_count)
"""
try:
with open(filename, 'r', encoding='utf-8') as f:
lines = f.readlines()
except Exception as e:
print(f"Error reading response file: {e}")
return (0, 1)
# Process each non-empty line
inputs = [line.strip() for line in lines if line.strip() and not line.strip().startswith('#')]
if not inputs:
print(f"No valid inputs found in {filename}")
return (0, 0)
print(f"\n{'=' * 70}")
print(f"PROCESSING RESPONSE FILE: {filename}")
print(f"Found {len(inputs)} items to process")
print(f"{'=' * 70}")
success_count = 0
error_count = 0
for i, input_str in enumerate(inputs, 1):
print(f"\n[ITEM {i}/{len(inputs)}] Processing: {input_str}")
print("-" * 60)
try:
# Process input
url, gpt_id = process_gpt_input(input_str)
print(f"[PARSED] Full URL: {url}")
print(f"[PARSED] GPT ID: {gpt_id}")
# Generate template
success, result = generate_template(url, debug, dump)
if success:
filename = f"{result.gpt_id}.md"
with open(filename, 'w', encoding='utf-8') as f:
f.write(result.template)
print(f"[SUCCESS] Template saved to: {filename}")
# Save dump file if requested
if dump:
dump_filename = f"{result.gpt_id}.txt"
from io import StringIO
old_stdout = sys.stdout
sys.stdout = buffer = StringIO()
result.parser.dump(safe_ascii=True)
dump_content = buffer.getvalue()
sys.stdout = old_stdout
with open(dump_filename, 'w', encoding='utf-8') as f:
f.write(dump_content)
print(f"[SUCCESS] Dump saved to: {dump_filename}")
success_count += 1
else:
print(f"[FAILED] Error: {result}")
error_count += 1
except Exception as e:
print(f"[ERROR] Exception: {e}")
error_count += 1
print(f"\n{'=' * 70}")
print(f"RESPONSE FILE COMPLETE")
print(f"Success: {success_count}, Errors: {error_count}")
print(f"{'=' * 70}")
return (success_count, error_count)
def main():
parser = argparse.ArgumentParser(description='Generate markdown template for ChatGPT GPTs')
parser.add_argument('input', nargs='?', help='GPT URL, GPT ID, g-prefixed GPT ID, or @response_file')
parser.add_argument('--debug', action='store_true', help='Save debug files (HTML and dump)')
parser.add_argument('--dump', action='store_true', help='Save parsed names and values to .txt file')
args = parser.parse_args()
# Check if input was provided
if not args.input:
parser.print_help()
return 1
try:
# Check if input is a response file
if args.input.startswith('@'):
# Process response file
filename = args.input[1:] # Remove the @ prefix
success_count, error_count = process_response_file(filename, args.debug, args.dump)
sys.exit(0 if error_count == 0 else 1)
else:
# Process single input
print(f"\n[INPUT] Processing: {args.input}")
url, gpt_id = process_gpt_input(args.input)
print(f"[PARSED] Full URL: {url}")
print(f"[PARSED] GPT ID: {gpt_id}")
# Generate template
success, result = generate_template(url, args.debug, args.dump)
if success:
# Save to file
filename = f"{result.gpt_id}.md"
with open(filename, 'w', encoding='utf-8') as f:
f.write(result.template)
print(f"Template saved to: {filename}")
# Save dump file if requested
if args.dump:
dump_filename = f"{result.gpt_id}.txt"
from io import StringIO
old_stdout = sys.stdout
sys.stdout = buffer = StringIO()
result.parser.dump(safe_ascii=True)
dump_content = buffer.getvalue()
sys.stdout = old_stdout
with open(dump_filename, 'w', encoding='utf-8') as f:
f.write(dump_content)
print(f"Dump saved to: {dump_filename}")
# Also print the template
print("\nGenerated template:")
print("=" * 50)
try:
print(result.template)
except UnicodeEncodeError:
# Handle special characters that can't be printed to console
safe_template = result.template.encode('ascii', errors='replace').decode('ascii')
print(safe_template)
else:
print(f"Error: {result}")
return 1
except Exception as e:
print(f"Error: {e}")
return 1
if __name__ == "__main__":
    sys.exit(main())

Tools/openai_gpts/gptparser.py Normal file

@@ -0,0 +1,167 @@
"""
GPT parsing module.
The GPT markdown files have to adhere to a very specific format described in the README.md file in the root of the CSP project.
"""
import os, re
from collections import namedtuple
from typing import Union, Tuple, Generator, Iterator
GPT_BASE_URLS = ('https://chat.openai.com/g/g-', 'https://chatgpt.com/g/g-')
GPT_BASE_URLS_L = [len(url) for url in GPT_BASE_URLS]
FIELD_PREFIX = 'GPT'
GPT_FILE_ID_RE = re.compile(r'^([0-9a-z]{9})_(.*)\.md$', re.IGNORECASE)
"""GPT file name regex with ID and name capture."""
GPT_FILE_VERSION_RE = re.compile(r'\[([^]]*)\]\.md$', re.IGNORECASE)
"""GPT file name regex with version capture."""
GptFieldInfo = namedtuple('FieldInfo', ['order', 'display'])
GptIdentifier = namedtuple('GptIdentifier', ['id', 'name'])
"""Description of the fields supported by GPT markdown files."""
SUPPORTED_FIELDS = {
'url': GptFieldInfo(10, 'URL'),
'title': GptFieldInfo(20, 'Title'),
'description': GptFieldInfo(30, 'Description'),
'logo': GptFieldInfo(40, 'Logo'),
'verif_status': GptFieldInfo(50, 'Verification Status'),
'instructions': GptFieldInfo(60, 'Instructions'),
'actions': GptFieldInfo(70, 'Actions'),
'kb_files_list': GptFieldInfo(80, 'KB Files List'),
'extras': GptFieldInfo(90, 'Extras'),
'protected': GptFieldInfo(100, 'Protected'),
}
"""
Dictionary of the fields supported by GPT markdown files:
- The key should always be in lower case
- The GPT markdown file will have the form: {FIELD_PREFIX} {key}: {value}
"""
class GptMarkdownFile:
"""
A class to represent a GPT markdown file.
"""
    def __init__(self, fields=None, filename: str = '') -> None:
        # Avoid a mutable default argument: each instance gets its own dict
        self.fields = fields if fields is not None else {}
        self.filename = filename
def get(self, key: str, strip: bool = True) -> Union[str, None]:
"""
Return the value of the field with the specified key.
:param key: str, key of the field.
:return: str, value of the field.
"""
key = key.lower()
if key == 'version':
m = GPT_FILE_VERSION_RE.search(self.filename)
return m.group(1) if m else ''
v = self.fields.get(key)
return v.strip() if strip else v
def id(self) -> Union[GptIdentifier, None]:
"""
Return the GPT identifier.
:return: GptIdentifier object.
"""
return parse_gpturl(self.fields.get('url'))
def __str__(self) -> str:
sorted_fields = sorted(self.fields.items(), key=lambda x: SUPPORTED_FIELDS[x[0]].order)
# Check if the field value contains the start marker of the markdown block and add a blank line before it
field_strings = []
for key, value in sorted_fields:
if value:
# Only replace the first occurrence of ```markdown
modified_value = value.replace("```markdown", "\r\n```markdown", 1)
field_string = f"{FIELD_PREFIX} {SUPPORTED_FIELDS[key].display}: {modified_value}"
field_strings.append(field_string)
return "\r\n".join(field_strings)
@staticmethod
    def parse(file_path: str) -> Tuple[bool, Union['GptMarkdownFile', str]]:
        """
        Parse a markdown file into a GptMarkdownFile object.
        :param file_path: str, path to the markdown file.
        :return: (True, GptMarkdownFile) on success, otherwise (False, error message).
        """
if not os.path.exists(file_path):
return (False, f"File '{file_path}' does not exist.")
with open(file_path, 'r', encoding='utf-8') as file:
fields = {key.lower(): [] for key in SUPPORTED_FIELDS.keys()}
            field_re = re.compile(rf"^\s*{FIELD_PREFIX}\s+({'|'.join(fields.keys())}):", re.IGNORECASE)
current_field = None
for line in file:
if m := field_re.match(line):
current_field = m.group(1).lower()
line = line[len(m.group(0)):].strip()
if current_field:
if current_field not in SUPPORTED_FIELDS:
return (False, f"Field '{current_field}' is not supported.")
fields[current_field].append(line)
gpt = GptMarkdownFile(
{key: ''.join(value) for key, value in fields.items()},
filename=file_path)
return (True, gpt)
def save(self, file_path: str) -> Tuple[bool, Union[str, None]]:
"""
Save the GptMarkdownFile object to a markdown file.
:param file_path: str, path to the markdown file.
"""
try:
with open(file_path, 'w', encoding='utf-8') as file:
file.write(str(self))
return (True, None)
except Exception as e:
return (False, f"Failed to save file '{file_path}': {e}")
def parse_gpturl(url: str) -> Union[GptIdentifier, None]:
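    """Parse a ChatGPT GPT URL and return a GptIdentifier(id, name), or None if the URL does not match a known GPT base URL."""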
for GPT_BASE_URL, GPT_BASE_URL_L in zip(GPT_BASE_URLS, GPT_BASE_URLS_L):
if url and url.startswith(GPT_BASE_URL):
id = url[GPT_BASE_URL_L:].split('\n')[0]
i = id.find('-')
if i != -1:
return GptIdentifier(id[:i], id[i+1:])
else:
return GptIdentifier(id, '')
def get_prompts_path() -> str:
"""Return the path to the Custom GPTs prompts directory."""
return os.path.abspath(os.path.join(os.path.dirname(__file__), '..', '..', 'CustomInstructions', 'ChatGPT'))
def enum_gpts() -> Generator[Tuple[bool, Union[GptMarkdownFile, str]], None, None]:
"""Enumerate all the GPT files in the prompts directory, parse them and return the parsed GPT object."""
prompts_path = get_prompts_path()
for file_path in os.listdir(prompts_path):
_, ext = os.path.splitext(file_path)
if ext != '.md':
continue
file_path = os.path.join(prompts_path, file_path)
ok, gpt = GptMarkdownFile.parse(file_path)
if ok:
yield (True, gpt)
else:
yield (False, f"Failed to parse '{file_path}': {gpt}")
def enum_gpt_files() -> Iterator[Tuple[str, str]]:
"""
Enumerate all the GPT files in the prompts directory while relying on the files naming convention.
To normalize all the GPT file names, run the `idxtool.py --rename`
"""
prompts_path = get_prompts_path()
for file_path in os.listdir(prompts_path):
m = GPT_FILE_ID_RE.match(file_path)
if not m:
continue
file_path = os.path.join(prompts_path, file_path)
yield (m.group(1), file_path)

Tools/openai_gpts/idxtool.py Normal file

@@ -0,0 +1,276 @@
"""
idxtool is a script used to perform various GPT indexing and searching tasks:
- Find a GPT file by its ID or full ChatGPT URL or via a file containing a list of GPT IDs.
- Rename all the GPTs to include their ChatGPT/g/ID in the filename.
- Generate TOC
- etc.
"""
import sys, os, argparse
from typing import Tuple
from urllib.parse import quote
import gptparser
from gptparser import enum_gpts, parse_gpturl, enum_gpt_files, get_prompts_path
import gen_gpt_templ
TOC_FILENAME = os.path.abspath(os.path.join(get_prompts_path(), '..', 'README.md'))
TOC_GPT_MARKER_LINE = '## ChatGPT GPT instructions'
def rename_gpts():
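    """Rename the GPT files so each filename is prefixed with its GPT ID, using `git mv` when possible and falling back to os.rename()."""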
effective_rename = nb_ok = nb_total = 0
for ok, gpt in enum_gpts():
nb_total += 1
if not ok or not (id := gpt.id()):
print(f"[!] {gpt.filename}")
continue
# Skip files with correct prefix
basename = os.path.basename(gpt.filename)
if basename.startswith(f"{id.id}_"):
nb_ok += 1
continue
effective_rename += 1
# New full file name with ID prefix
new_fn = os.path.join(os.path.dirname(gpt.filename), f"{id.id}_{basename}")
print(f"[+] {basename} -> {os.path.basename(new_fn)}")
if os.system(f"git mv \"{gpt.filename}\" \"{new_fn}\"") == 0:
nb_ok += 1
continue
# If git mv failed, then try os.rename
try:
os.rename(gpt.filename, new_fn)
nb_ok += 1
continue
except OSError as e:
print(f"Rename error: {e.strerror}")
msg = f"Renamed {nb_ok} out of {nb_total} GPT files."
ok = nb_ok == nb_total
if effective_rename == 0:
msg = f"All {nb_total} GPT files were already renamed. No action taken."
print(msg)
return (ok, msg)
def parse_gpt_file(filename) -> Tuple[bool, str]:
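    """Parse a GPT markdown file and save the reformatted result alongside it as '<name>.new.md'."""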
ok, gpt = gptparser.GptMarkdownFile.parse(filename)
if ok:
file_name_without_ext = os.path.splitext(os.path.basename(filename))[0]
dst_fn = os.path.join(
os.path.dirname(filename),
f"{file_name_without_ext}.new.md")
gpt.save(dst_fn)
else:
print(gpt)
return (ok, gpt)
def rebuild_toc(toc_out: str = '') -> Tuple[bool, str]:
"""
    Rebuilds the GPT custom instructions table of contents by reading all the GPT files in the CustomInstructions/ChatGPT directory.
"""
if not toc_out:
print(f"Rebuilding Table of Contents GPT custom instructions in place")
else:
print(f"Rebuilding Table of Contents GPT custom instructions to '{toc_out}'")
toc_in = TOC_FILENAME
if not toc_out:
toc_out = toc_in
if not os.path.exists(toc_in):
return (False, f"TOC File '{toc_in}' does not exist.")
# Read the TOC file and find the marker line for the GPT instructions
out = []
marker_found = False
with open(toc_in, 'r', encoding='utf-8') as file:
for line in file:
out.append(line)
if line.startswith(TOC_GPT_MARKER_LINE):
out.append('\n')
marker_found = True
break
if not marker_found:
return (False, f"Could not find the marker '{TOC_GPT_MARKER_LINE}' in '{toc_in}'. Please revert the TOC file and try again.")
# Write the TOC file all the way up to the marker line
try:
ofile = open(toc_out, 'w', encoding='utf-8')
    except OSError as e:
        return (False, f"Failed to open '{toc_out}' for writing: {e}")
# Count GPTs
enumerated_gpts = list(enum_gpts())
nb_ok = sum(1 for ok, gpt in enumerated_gpts if ok and gpt.id())
# Write the marker line and each GPT entry
out.append(f"There are {nb_ok} GPTs total:\n\n")
nb_ok = nb_total = 0
gpts = []
for ok, gpt in enumerated_gpts:
nb_total += 1
if ok:
if gpt_id := gpt.id():
nb_ok += 1
gpts.append((gpt_id, gpt))
else:
print(f"[!] No ID detected: {gpt.filename}")
else:
print(f"[!] {gpt}")
# Consistently sort the GPTs by ID and GPTs title
def gpts_sorter(key):
gpt_id, gpt = key
version = f"{gpt.get('version')}" if gpt.get('version') else ''
return f"{gpt.get('title')}{version} (id: {gpt_id.id}))"
gpts.sort(key=gpts_sorter)
for id, gpt in gpts:
file_link = f"./ChatGPT/{quote(os.path.basename(gpt.filename))}"
version = f" {gpt.get('version')}" if gpt.get('version') else ''
out.append(f"- [{gpt.get('title')}{version} (id: {id.id})]({file_link})\n")
ofile.writelines(out)
ofile.close()
msg = f"Generated TOC with {nb_ok} out of {nb_total} GPTs."
ok = nb_ok == nb_total
if ok:
print(msg)
return (ok, msg)
def make_template(input_str, verbose=True):
"""Creates a GPT template file from a ChatGPT URL/ID by downloading metadata"""
try:
# Process the input to handle URLs, IDs, conversation URLs, etc.
url, gpt_id = gen_gpt_templ.process_gpt_input(input_str)
if verbose:
print(f"[PARSED] Full URL: {url}")
print(f"[PARSED] GPT ID: {gpt_id}")
# Use gen_gpt_templ to generate the template with actual metadata
success, result = gen_gpt_templ.generate_template(url, debug=False, dump=False)
if not success:
msg = f"Failed to generate template: {result}"
if verbose:
print(msg)
return (False, msg)
# Extract the template content and gpt_id from the result
template_content = result.template
gpt_id = result.gpt_id
# Save to the current working directory with the proper filename
filename = f"{gpt_id}.md"
# Check if file already exists
if os.path.exists(filename):
msg = f"File '{filename}' already exists."
if verbose:
print(msg)
return (False, msg)
# Write the template content
with open(filename, 'w', encoding='utf-8') as file:
file.write(template_content)
msg = f"Created template '{filename}' for URL '{url}'"
if verbose:
print(msg)
return (True, msg)
except Exception as e:
msg = f"Error creating template: {str(e)}"
if verbose:
print(msg)
return (False, msg)
def find_gptfile(keyword, verbose=True):
"""Find a GPT file by its ID or full ChatGPT URL
The ID can be prefixed with '@' to indicate a file containing a list of GPT IDs.
"""
keyword = keyword.strip()
# Response file with a set of GPT IDs
if keyword.startswith('@'):
with open(keyword[1:], 'r', encoding='utf-8') as file:
ids = set()
for line in file:
line = line.strip()
# Skip comments
if line.startswith('#'):
continue
# If the line is a GPT URL, then extract the ID
if gpt_info := parse_gpturl(line):
ids.add(gpt_info.id)
continue
# If not a GPT URL, then it's a GPT ID
ids.add(line)
elif gpt_info := parse_gpturl(keyword):
# A single GPT URL
ids = {gpt_info.id}
else:
# A single GPT ID
ids = {keyword}
if verbose:
print(f'Looking for GPT files with IDs: {", ".join(ids)}')
matches = []
for id, filename in enum_gpt_files():
if id in ids:
if verbose:
print(filename)
matches.append((id, filename))
return matches
def main():
parser = argparse.ArgumentParser(description='idxtool: A GPT indexing and searching tool for the CSP repo')
parser.add_argument('--toc', nargs='?', const='', type=str, help='Rebuild the table of contents of custom GPTs')
parser.add_argument('--find-gpt', type=str, help='Find a GPT file by its ID or full ChatGPT URL')
parser.add_argument('--template', type=str, help='Creates a GPT template file from a ChatGPT URL, GPT ID, or g-prefixed ID')
parser.add_argument('--parse-gptfile', type=str, help='Parses a GPT file name')
parser.add_argument('--rename', action='store_true', help='Rename the GPT file names to include their GPT ID')
# Handle arguments
ok = True
args = parser.parse_args()
# Check if no arguments were provided
if not any(vars(args).values()):
parser.print_help()
sys.exit(0)
if args.parse_gptfile:
ok, err = parse_gpt_file(args.parse_gptfile)
if not ok:
print(err)
elif args.toc is not None:
ok, err = rebuild_toc(args.toc)
if not ok:
print(err)
elif args.find_gpt:
find_gptfile(args.find_gpt)
elif args.template:
make_template(args.template)
elif args.rename:
ok, err = rename_gpts()
if not ok:
print(err)
sys.exit(0 if ok else 1)
if __name__ == "__main__":
main()

Tools/openai_gpts/oneoff.py Normal file

@@ -0,0 +1,54 @@
"""
'oneoff.py' is a script that performs one-off operations on the GPT files
- Reformat all the GPT files in the source path and save them to the destination path.
"""
from gptparser import GptMarkdownFile
from typing import Tuple
import os
def reformat_gpt_files(src_path: str, dst_path: str) -> Tuple[bool, str]:
"""
Reformat all the GPT files in the source path and save them to the destination path.
:param src_path: str, path to the source directory.
:param dst_path: str, path to the destination directory.
"""
if not os.path.exists(src_path):
return (False, f"Source path '{src_path}' does not exist.")
if not os.path.exists(dst_path):
os.makedirs(dst_path)
print(f"Reformatting GPT files in '{src_path}' and saving them to '{dst_path}'...")
nb_ok = nb_total = 0
for src_file_path in os.listdir(src_path):
_, ext = os.path.splitext(src_file_path)
if ext != '.md':
continue
nb_total += 1
dst_file_path = os.path.join(dst_path, src_file_path)
src_file_path = os.path.join(src_path, src_file_path)
ok, gpt = GptMarkdownFile.parse(src_file_path)
if ok:
ok, msg = gpt.save(dst_file_path)
if ok:
id = gpt.id()
if id:
info = f"; id={id.id}"
if id.name:
info += f", name='{id.name}'"
else:
info = ''
print(f"[+] saved '{os.path.basename(src_file_path)}'{info}")
nb_ok += 1
else:
print(f"[!] failed to save '{src_file_path}': {msg}")
else:
print(f"[!] failed to parse '{src_file_path}': {gpt}")
msg = f"Reformatted {nb_ok} out of {nb_total} GPT files."
ok = nb_ok == nb_total
return (ok, msg)

requirements.txt Normal file

@@ -0,0 +1 @@
GitPython