Large Language Models for Computer Aided Design
Organisations involved
Main Participant: SSC is a German IT service provider, specialising in secure data exchange and management solutions for industrial collaboration.
Technology provider: Karlsruhe Institute of Technology (KIT) is one of Europe’s leading institutions for research, teaching, and innovation at the interface of science and industry. KIT provides assistance in research and technology.
The Challenge
2D CAD files are widely used across the industry and are often archived as images, making it difficult to retrieve structured information. While CAD images contain valuable data in graphical elements and embedded annotations, this information is not readily accessible for automated processing or search.
Conventional Optical Character Recognition (OCR) systems have clear limitations when applied to CAD drawings. They struggle to accurately extract tabular information and annotations in complex or dense layouts, meaning even simple queries cannot be handled efficiently. The wide variety of CAD file formats further complicates extraction, while sensitive engineering data imposes strict data protection and sovereignty constraints.
SSC operates an established data exchange platform widely used in the automotive industry. However, when engineers collaborate and exchange CAD data, information from drawings is still commonly extracted by hand. This process is time-consuming, error-prone, and does not scale.
SSC identified that vision–language models could automate information extraction from CAD images. Suitable open-weight models were benchmarked but required fine-tuning on synthetically generated image–text data and extensive performance evaluation. These workloads are computationally intensive, making large-scale HPC indispensable.
The Solution
SSC developed a generative AI solution that automates the extraction of metadata and structured information from 2D CAD drawings and related documents. To meet strict data protection requirements, the solution is designed as a local AI deployment.
A configurable synthetic data generation pipeline was created to produce large volumes of training data that reflect real-world CAD formats. Vision–language models were fine-tuned using this synthetic data and evaluated on existing drawings. Training and benchmarking in the first phase required approximately 6,500 GPU node hours.
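The idea of a configurable synthetic data generator can be illustrated with a minimal sketch. It is not SSC's pipeline: the field names, value pools, and the choice of SVG as the rendering format are all assumptions made for illustration. The sketch draws a simple title-block table and emits the matching ground-truth text, which is the kind of image–text pair a vision–language model could be fine-tuned on.

```python
import json
import random
from dataclasses import dataclass, field


@dataclass
class TitleBlockConfig:
    # Hypothetical field names and value pools (assumptions, not SSC's schema).
    # A real pipeline would mirror the title-block layouts of actual CAD formats.
    fields: dict = field(default_factory=lambda: {
        "Part No.": ["A-1001", "B-2047", "C-3310"],
        "Material": ["Steel", "Aluminium", "ABS"],
        "Scale": ["1:1", "1:2", "1:5"],
    })
    cell_width: int = 120
    cell_height: int = 24


def make_sample(config, rng):
    """Return (svg_text, label_json) for one synthetic title block."""
    # Pick one value per field; this dictionary is the ground-truth label.
    values = {name: rng.choice(pool) for name, pool in config.fields.items()}
    parts = []
    for row, (name, value) in enumerate(values.items()):
        y = row * config.cell_height
        # Two columns per row: field name on the left, field value on the right.
        for col, text in enumerate((name, value)):
            x = col * config.cell_width
            parts.append(
                f'<rect x="{x}" y="{y}" width="{config.cell_width}" '
                f'height="{config.cell_height}" fill="none" stroke="black"/>'
            )
            parts.append(f'<text x="{x + 4}" y="{y + 16}" font-size="12">{text}</text>')
    width = 2 * config.cell_width
    height = len(values) * config.cell_height
    svg = (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        + "".join(parts)
        + "</svg>"
    )
    return svg, json.dumps(values, sort_keys=True)


def generate_dataset(config, n, seed=0):
    """Generate n (image, label) pairs reproducibly from a seed."""
    rng = random.Random(seed)
    return [make_sample(config, rng) for _ in range(n)]


if __name__ == "__main__":
    svg, label = generate_dataset(TitleBlockConfig(), 1)[0]
    print(label)
```

Keeping the layout parameters and value pools in a configuration object means new drawing styles can be covered by changing data rather than code, and a fixed seed makes each generated dataset reproducible for benchmarking.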
The AI model is intended to be integrated into SSC’s established data exchange software, SWAN. This integration will allow engineers to query CAD drawings via an interactive chatbot, significantly reducing manual effort when sending, receiving, and reviewing engineering data.
Impact
The innovation study extends SSC’s core offering into AI-driven automation, enabling entry into new markets and strengthening its position in the automotive sector. Testing across 2,600 CAD drawings revealed significant efficiency gains, with a 92% reduction in time for table header classification compared to manual processes. The solution improves productivity, reduces errors, and supports the digitisation of engineering workflows.
By automating repetitive review tasks, the solution frees engineers to focus on higher-value activities such as design and collaboration. The project also strengthened AI expertise within SSC, contributing to workforce upskilling.
Finally, more efficient data handling and reduced manual rework contribute to lower resource consumption in engineering processes, supporting more sustainable industrial workflows.
Benefits
- Strategic expansion of SSC’s portfolio into AI-driven automation for engineering.
- 92% reduction in time required to extract key metadata from CAD drawings.
- 96% reduction in manual effort compared to traditional review processes.
- Improved data quality through automated metadata validation against the product data management (PDM) system.
- Knowledge development, with 3–4 SSC employees gaining advanced AI and HPC expertise.
- Solution easily adaptable to other industries and document-intensive use cases.