🧙‍♂️ MagicXML is a FastAPI-based service designed to fetch, process, and convert XML data into structured CSV files. It is optimized for handling large XML files by processing them in chunks asynchronously, making it suitable for heavy data processing tasks.



MagicXML 🧙‍♂️📜

Overview

MagicXML is a web application built with FastAPI that allows users to submit URLs pointing to XML files, processes the content, and converts the data into CSV format. The application supports asynchronous processing, ensuring high performance when handling large volumes of data.
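The fetch, parse, and convert flow described above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual code: `fake_download` stands in for a real HTTP download, and the `offer`, `name`, and `price` names are hypothetical, since the XML structure MagicXML expects is not documented here.

```python
import asyncio
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample feed; the real XML layout is an assumption.
SAMPLE = (
    "<offers>"
    "<offer><name>Lamp</name><price>10</price></offer>"
    "<offer><name>Desk</name><price>95</price></offer>"
    "</offers>"
)


async def fake_download(text, size):
    """Yield the document in small pieces, standing in for a network stream."""
    for start in range(0, len(text), size):
        await asyncio.sleep(0)  # hand control back to the event loop
        yield text[start:start + size]


async def xml_stream_to_csv(chunks, record_tag, fields):
    """Parse XML incrementally from an async chunk stream and return CSV text."""
    parser = ET.XMLPullParser(events=("end",))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(fields)
    async for chunk in chunks:
        parser.feed(chunk)
        # Only elements that are fully closed so far are reported here.
        for _event, elem in parser.read_events():
            if elem.tag == record_tag:
                writer.writerow([(elem.findtext(f) or "").strip() for f in fields])
                elem.clear()  # drop the record's children once written
    parser.close()
    return out.getvalue()


csv_text = asyncio.run(
    xml_stream_to_csv(fake_download(SAMPLE, 32), "offer", ["name", "price"])
)
print(csv_text)
```

Because the pull parser is fed piece by piece, a record can be written out as soon as its closing tag arrives, even when a chunk boundary falls in the middle of a tag.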

🚀 Features

  • Asynchronous Processing: Efficiently fetches and processes XML data in chunks using asyncio and aiohttp, ensuring scalability and high performance.
  • Customizable XML Parsing: Tailored to handle specific XML structures, extracting and cleaning data as required.
  • Data Cleaning and Sanitization: Removes unwanted HTML tags and special characters from descriptions and names to ensure data integrity.
  • CSV Export: Converts processed XML data into well-structured CSV files, supporting various encoding standards.
  • REST API Interface: Provides simple API endpoints to trigger processing and retrieve files programmatically.
  • Error Handling: Implements robust error management to capture and report issues during XML processing.
  • CORS Support: Allows requests from any origin, facilitating integration with other services and applications.
  • Processing Status Tracking: Enables users to check the status of their processing tasks using a preset_id.
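As a rough illustration of the data cleaning feature listed above, the sketch below strips HTML tags and control characters from a text field. The exact rules MagicXML applies are not documented here, so `clean_text` and its behavior are assumptions made for this example.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")   # anything that looks like an HTML tag
SPACE_RE = re.compile(r"\s+")     # runs of whitespace, including non-breaking spaces


def clean_text(raw):
    """Return `raw` without HTML tags, entities, or control characters."""
    text = html.unescape(raw)          # decode entities such as &amp; and &nbsp;
    text = TAG_RE.sub(" ", text)       # replace tags with a space
    # Keep printable characters and whitespace; drop control characters.
    text = "".join(ch for ch in text if ch.isprintable() or ch.isspace())
    return SPACE_RE.sub(" ", text).strip()


print(clean_text("<p>Oak&nbsp;desk, <b>120&nbsp;cm</b></p>\n"))
print(clean_text("Lamp\x00 &amp; shade"))
```

Entities are decoded before tags are removed, so markup that arrives escaped (for example `&lt;b&gt;`) is stripped as well. Whether that is desirable depends on the feed.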

🛠️ Installation

Requirements:

  • Python 3.8+
  • FastAPI: A modern, high-performance web framework for building APIs with Python.
  • aiohttp: An asynchronous HTTP client/server framework.
  • aiofiles: A library for handling local file operations asynchronously.
  • Jinja2: A templating engine for Python.

API Usage Example

To use the API, you can send a POST request to the /process_link endpoint with the necessary parameters. Below is an example using curl:

curl -X 'POST' \
  'https://solarxml.replit.app/process_link' \
  -H 'Content-Type: application/json' \
  -d '{"link_url": "YOUR_XML_URL", "preset_id": "id=1234"}' \
  -o process_response.json

Replace YOUR_XML_URL with the URL of the XML data you want to process. The -o option saves the response to a file named process_response.json.
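The same request can be made from Python. The sketch below uses only the standard library and mirrors the curl call: the host, the /process_link path, and the JSON fields come from the example above, while the helper names `build_process_request` and `submit` and the sample feed URL are hypothetical. The response format is not documented here, so the body is saved as-is.

```python
import json
import urllib.request

API_URL = "https://solarxml.replit.app/process_link"


def build_process_request(link_url, preset_id):
    """Build the POST request without sending it."""
    payload = json.dumps({"link_url": link_url, "preset_id": preset_id})
    return urllib.request.Request(
        API_URL,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(link_url, preset_id, out_path="process_response.json"):
    """Send the request and save the raw response body to `out_path`."""
    req = build_process_request(link_url, preset_id)
    with urllib.request.urlopen(req, timeout=30) as resp, open(out_path, "wb") as fh:
        fh.write(resp.read())


# Placeholder feed URL; replace it with a real XML link before calling submit().
request = build_process_request("https://example.com/feed.xml", "id=1234")
print(request.get_method(), request.full_url)
```

Calling `submit("YOUR_XML_URL", "id=1234")` performs the network request and writes the result to process_response.json, as the curl command does.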

Clone the Repository

git clone https://github.com/Solrikk/MagicXML.git
cd MagicXML

Install Dependencies

You can install the required dependencies using pip:

pip install -r requirements.txt
