Executive Summary
A telecommunications company in Indonesia manages large volumes of data originating from multiple sources and formats, including spreadsheets, presentations, databases, and unstructured documents. The lack of standardization across these datasets makes the data harder to access and analyze and delays insight generation. This particularly affects operational use cases such as Customer Complaint Handling, where rapid identification of issues and trends is critical.
To establish a scalable data foundation, the company develops the ITOPS Data Hub to transform disparate data into a unified, structured, and analytics-ready format. The platform leverages Amazon Web Services (AWS) to enable automated data ingestion, transformation, and centralized storage, supporting seamless consumption by Amazon Athena and Amazon QuickSight for querying and visualization.
Within this foundation, the Customer Complaint Handling use case utilizes Generative AI and OCR capabilities to enrich and classify complaint data extracted from raw notes and supporting documents. The solution performs automated attribute extraction, text embedding, clustering, and standardization to convert unstructured inputs into meaningful, tabular insights. This enables faster root-cause identification, improved trend visibility, and more efficient operational response.
The solution is designed to support both structured and selected unstructured formats, with text extraction powered by Amazon Textract and specialized document parsers where required. Processed data is stored in Amazon S3 and made query-ready in Amazon Athena, providing a scalable and governed analytics environment.
Success is measured by the model’s ability to achieve at least 95% accuracy in mapping and classifying complaint data according to predefined standards. Through the ITOPS Data Hub initiative, the company strengthens its data-driven operational capabilities, improves complaint handling effectiveness, and builds a reusable foundation for future analytics and AI-driven use cases.
Challenge
The company faces challenges in the internet service purchasing process, where customers often encounter issues during transactions. In many cases, customers successfully purchase an internet package but are unable to access the service due to technical problems. The 24×7 support team is available to handle these complaints, but it has limited ability to identify the root cause of each issue efficiently.
One of the biggest challenges is the lack of an automated system capable of categorizing and analyzing customer complaints. Currently, the support team does not have access to a system that can monitor complaint trends in real time, making it difficult to determine whether recurring issues are caused by third-party service integration failures or internal technical problems. As a result, the support team must handle complaints manually and on a case-by-case basis, which slows down issue resolution and reduces efficiency.
Solution
To address the company’s challenges in managing and analyzing customer complaints, a data-driven, AI-powered solution is proposed, leveraging AWS services. This solution, aligned with the Gen AI ITOps Data Hub design, will automate complaint summarization and categorization, and provide real-time insights. The core of this solution is to transform disparate data sources into a cohesive, tabular format, enabling efficient data analysis and visualization.
The solution will encompass the following:
- Data Ingestion and Storage:
- AWS Glue will extract raw customer complaint data from the company’s ticketing database. This process will be designed to handle the sources and formats outlined in the Gen AI ITOps Data Hub document, including SQL and PostgreSQL sources, CSV, TSV, and XLSX files, and PDF and DOCX documents. This ensures that all relevant data, regardless of its original format, can be incorporated into the analysis.
- The extracted data will be stored in Amazon S3, forming a data lake. The data lake architecture will adhere to the best practices and blueprint outlined in the Gen AI ITOps Data Hub document, with particular attention to scalability, cost-efficiency, and security. This includes defining a robust bucket strategy, data organization, metadata management (using the AWS Glue Data Catalog), data quality measures, and data lifecycle management.
- Generative AI-Powered Complaint Processing:
- A backend system deployed on Amazon ECS will process the raw data from S3.
- This system will interact with Amazon Bedrock and, where needed, Amazon Textract:
- Large language models (LLMs) on Amazon Bedrock will be used to summarize customer complaints and categorize them into predefined groups (e.g., Activation Issues, Billing Problems, Network Coverage). The model will be chosen based on accuracy and cost considerations, similar to the Claude 3 Sonnet model mentioned in the Gen AI ITOps Data Hub document. Token usage will be carefully monitored and optimized, as detailed in the Gen AI ITOps Data Hub document. The success criterion for the model is at least 95% accuracy, defined as the percentage of complaint records for which the LLM’s mapping matches the expected category or resolution under predefined standards.
- Amazon Textract will be used to extract text from any image-based complaints, enabling the LLM to process a wider range of input.
- Real-Time Data Querying and Analytics:
- AWS Glue Data Catalog will manage the metadata of the processed data in Amazon S3.
- Amazon Athena will enable SQL-based querying of the structured complaint data.
- Dashboard for Support Team Monitoring:
- Amazon QuickSight, with Generative BI, will provide a real-time dashboard for the support team. This dashboard will offer:
- Complaint volume by category
- Geographical distribution of issues
- Trending issues over time
- AI-powered insights via natural language queries
- Scalability and High Availability:
- The architecture will be designed for scalability and high availability, leveraging multiple AWS Availability Zones (AZs) as described in the Gen AI ITOps Data Hub document. This will ensure business continuity and minimize downtime.
- Data Lake Architecture:
- The solution will implement a data lake architecture on AWS, following the blueprint provided in the Gen AI ITOps Data Hub document. This will include:
- Data Lake Storage (S3): With a defined bucket approach, compression, partitioning, and data encryption strategy.
- Ingestion: Supporting batch and near real-time data ingestion.
- Data Catalog (AWS Glue Data Catalog): For metadata management, orchestration, and logging/monitoring.
- Data Processing (Glue, EMR): For batch and serverless data processing.
- Data Consumption: Defining consumption methods, data permissions, and data access controls.
- Data Visualization and Analysis: Using Amazon QuickSight and Amazon Q.
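
The categorization step described above can be sketched as a prompt builder plus a label normalizer. This is a minimal illustration, not the deployed implementation: the three categories are the examples given in the text, while the `Uncategorized` fallback, the prompt wording, and the function names are assumptions. The call to Amazon Bedrock itself is deliberately left out.

```python
from typing import Sequence

# Example categories taken from the text; a real deployment would load the
# full predefined list from configuration.
CATEGORIES = ("Activation Issues", "Billing Problems", "Network Coverage")
FALLBACK = "Uncategorized"  # hypothetical label for replies outside the list


def build_prompt(complaint: str, categories: Sequence[str] = CATEGORIES) -> str:
    """Build a classification prompt that restricts the model to known labels."""
    options = "\n".join(f"- {name}" for name in categories)
    return (
        "Classify the customer complaint into exactly one category.\n"
        f"Categories:\n{options}\n"
        "Reply with the category name only.\n\n"
        f"Complaint: {complaint.strip()}"
    )


def normalize_label(raw: str, categories: Sequence[str] = CATEGORIES) -> str:
    """Map the model's free-text reply onto a predefined category."""
    cleaned = raw.strip().strip(".\"'").casefold()
    for name in categories:
        if cleaned == name.casefold():
            return name
    return FALLBACK
```

Normalizing the reply before storing it keeps the category column limited to the predefined values, which simplifies grouping in Amazon Athena and Amazon QuickSight.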

The solution architecture is designed to be highly scalable, available, and secure, leveraging the following AWS services:
- Data Lake: Amazon S3 will serve as the foundation for the data lake, providing scalable and cost-effective storage for all complaint data. The data lake architecture will adhere to the principles outlined in the Gen AI ITOps Data Hub document, ensuring a well-organized and easily accessible data repository.
- Data Ingestion: AWS Glue will be used for Extract, Transform, and Load (ETL) operations. It will extract data from various sources, transform it into a consistent tabular format, and load it into Amazon S3. This process will handle structured sources and formats (SQL, PostgreSQL, CSV, TSV, XLSX) and unstructured formats (PDF, DOCX).
- AI Processing:
- A backend system deployed on Amazon ECS (Elastic Container Service) will host the application logic for processing the complaint data. This system will use LLMs from Amazon Bedrock to summarize and categorize complaints, and Amazon Textract to extract text from images.
- Data Catalog and Querying: AWS Glue Data Catalog will provide a centralized metadata repository, enabling efficient querying and analysis. Amazon Athena will allow support teams to use SQL to query the processed data in Amazon S3.
- Visualization: Amazon QuickSight will provide interactive dashboards and visualizations, enabling real-time monitoring of complaint trends and key metrics. Amazon QuickSight with Generative BI will provide AI-generated insights.
- Orchestration and Automation: AWS Step Functions can be used to orchestrate the data processing pipeline, ensuring that all steps are executed in the correct order and that errors are handled appropriately.
- Infrastructure as Code: The entire infrastructure will be defined and managed using AWS CloudFormation, enabling consistent deployments and infrastructure automation.
- Security: Security will be a key consideration throughout the architecture. AWS Identity and Access Management (IAM), AWS Key Management Service (KMS), Amazon GuardDuty, and AWS Security Hub will be used to provide access control, data protection, and threat detection.
- Monitoring and Logging: Amazon CloudWatch will be used to monitor the performance and health of the system, providing logs, metrics, and alerts.
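
The orchestration item above, with ordered steps and explicit error handling, could be expressed as an Amazon States Language definition. The sketch below builds one as a Python dictionary. The state names, job name, cluster, and task definition are hypothetical placeholders, the retry values are illustrative, and the ECS networking parameters are omitted.

```python
import json
from typing import Any, Dict, Optional

FAILURE_STATE = "PipelineFailed"  # hypothetical state name


def task_state(resource: str, parameters: Dict[str, Any],
               next_state: Optional[str]) -> Dict[str, Any]:
    """Build a Task state with a retry policy and a catch-all failure route."""
    state: Dict[str, Any] = {
        "Type": "Task",
        "Resource": resource,
        "Parameters": parameters,
        "Retry": [{
            "ErrorEquals": ["States.TaskFailed"],
            "IntervalSeconds": 30,
            "MaxAttempts": 2,
            "BackoffRate": 2.0,
        }],
        "Catch": [{"ErrorEquals": ["States.ALL"], "Next": FAILURE_STATE}],
    }
    if next_state is None:
        state["End"] = True
    else:
        state["Next"] = next_state
    return state


def build_definition() -> Dict[str, Any]:
    """Two-step pipeline: Glue ETL first, then the ECS classification task."""
    return {
        "Comment": "Sketch of the complaint processing pipeline",
        "StartAt": "RunGlueEtl",
        "States": {
            "RunGlueEtl": task_state(
                "arn:aws:states:::glue:startJobRun.sync",
                {"JobName": "complaints-etl"},  # hypothetical job name
                "ClassifyComplaints",
            ),
            "ClassifyComplaints": task_state(
                "arn:aws:states:::ecs:runTask.sync",
                {  # hypothetical names; network settings omitted
                    "Cluster": "complaints-cluster",
                    "TaskDefinition": "complaint-classifier",
                },
                None,
            ),
            FAILURE_STATE: {
                "Type": "Fail",
                "Error": "PipelineError",
                "Cause": "A pipeline step failed after retries",
            },
        },
    }


definition_json = json.dumps(build_definition(), indent=2)
```

The `.sync` integrations make each step wait for the job or task to finish before the next one starts, which is one way to guarantee ordering.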
Implementation
The implementation of this solution will involve the following key steps:
- Data Lake Setup: Configure Amazon S3 buckets, define the data organization and partitioning strategy, and set up appropriate access controls. This will follow the data lake architecture and blueprint as defined in the Gen AI ITOps Data Hub document.
- ETL Pipeline Development: Develop AWS Glue jobs to extract, transform, and load data from the various source systems into the data lake.
- AI Model Integration: Integrate with Amazon Bedrock to access the required LLMs. Develop the backend system on Amazon ECS to process data using these LLMs and Amazon Textract.
- Data Catalog and Querying Setup: Configure the AWS Glue Data Catalog and set up Amazon Athena to enable data querying.
- Dashboard Development: Create interactive dashboards in Amazon QuickSight to visualize complaint data and provide real-time insights.
- Testing and Deployment: Thoroughly test the solution in a development environment before deploying it to production. Use AWS CloudFormation to automate the deployment process.
- Training and Documentation: Provide training to the support team on how to use the new system. Create comprehensive documentation of the solution.
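
As an illustration of the partitioning strategy mentioned in the Data Lake Setup step, the sketch below builds Hive-style, date-partitioned object keys, a layout that AWS Glue and Amazon Athena can interpret as partitions. The dataset prefix and the year/month/day scheme are assumptions, since the text does not specify the actual layout.

```python
from datetime import date


def partitioned_key(dataset: str, day: date, filename: str) -> str:
    """Return an S3 object key partitioned by year, month, and day."""
    prefix = dataset.strip("/")
    return (
        f"{prefix}/year={day.year:04d}/month={day.month:02d}/"
        f"day={day.day:02d}/{filename}"
    )


# Hypothetical dataset prefix and file name, for illustration only.
example_key = partitioned_key(
    "processed/complaints", date(2024, 3, 5), "part-0000.parquet"
)
# example_key == "processed/complaints/year=2024/month=03/day=05/part-0000.parquet"
```

Partitioning by date lets queries that filter on a time range scan only the matching prefixes, which reduces both query latency and cost.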
Outcome
The implementation of the Gen AI ITOps Data Hub for Customer Complaint Handling establishes a measurable improvement in the company’s ability to process, classify, and analyze customer complaints. The solution converts previously unstructured complaint data into structured, analytics-ready information that enables operational teams to identify trends, root causes, and service issues more efficiently.
Structured Complaint Data Transformation
- The system converts previously free-text customer complaints into 19 standardized analytical parameters (e.g., product, buying channel, payment channel, issue category, service type, and other operational attributes).
- Nearly all processed complaint records (approximately 100%) are transformed into structured tabular data, enabling SQL-based analysis and dashboard visualization.
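
A minimal sketch of this transformation, assuming the model returns its extracted attributes as a JSON object: the function maps that output onto a fixed set of columns so every record has the same tabular shape. Only the five attributes named above are included. The remaining parameters are not listed in the text, and the `ticket_id` column and function name are hypothetical.

```python
import json
from typing import Dict, Optional

# Only the attributes named in the text; the other standardized parameters
# are not listed in the source and are left out of this sketch.
COLUMNS = (
    "product",
    "buying_channel",
    "payment_channel",
    "issue_category",
    "service_type",
)


def to_row(ticket_id: str, llm_json: str) -> Dict[str, Optional[str]]:
    """Convert one model response into a fixed-shape row."""
    try:
        extracted = json.loads(llm_json)
    except json.JSONDecodeError:
        extracted = {}
    if not isinstance(extracted, dict):
        extracted = {}

    row: Dict[str, Optional[str]] = {"ticket_id": ticket_id}
    for column in COLUMNS:
        value = extracted.get(column)
        # Missing or empty attributes become None so the schema stays stable.
        row[column] = None if value in (None, "") else str(value).strip()
    return row
```

Unknown keys in the model output are dropped and malformed output yields an all-empty row, so downstream tables keep a stable schema even when extraction fails.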
High-Accuracy Automated Classification
- Generative AI models deployed via Amazon Bedrock perform automated complaint summarization and categorization.
- The solution is designed to achieve ≥95% classification accuracy when mapping complaint descriptions to predefined parameters.
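
The accuracy criterion can be checked against a labeled evaluation set. The sketch below assumes accuracy is the share of records whose predicted label exactly matches the expected label. The function names are illustrative, and the evaluation set itself is not described in the text.

```python
from typing import Sequence

TARGET_ACCURACY = 0.95  # success threshold stated in the text


def classification_accuracy(predicted: Sequence[str],
                            expected: Sequence[str]) -> float:
    """Return the fraction of predictions that match the expected labels."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("evaluation set must not be empty")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


def meets_target(predicted: Sequence[str], expected: Sequence[str],
                 target: float = TARGET_ACCURACY) -> bool:
    """Check whether measured accuracy reaches the target threshold."""
    return classification_accuracy(predicted, expected) >= target
```

With 20 evaluation records, for example, 19 correct mappings meet the threshold while 18 do not.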
Improved Complaint Trend Visibility
- With structured data available for analytics, operational teams can analyze complaints by product, buying channel, payment channel, and issue category.
- The time required to identify complaint trends is expected to be reduced by approximately 60–70% compared to manual analysis.
Operational Efficiency Improvement
- Automated classification significantly reduces the need for manual complaint review and categorization.
- The solution is expected to reduce manual analysis workload by approximately 50–60%.
Faster Root Cause Identification
- Near real-time dashboards built using Amazon QuickSight enable faster monitoring of complaint patterns and operational issues.
- The time required to identify recurring issues and potential root causes is expected to decrease by approximately 30–40%.
Data-Driven Monitoring and Analytics
- Operational dashboards provide visibility into key metrics such as:
- Complaint volume by product
- Complaint distribution by buying channel
- Complaint distribution by payment channel
- Complaint trends over time
- This enables data-driven operational decision making and faster service improvement actions.
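
The metrics listed above reduce to simple grouped counts over the structured complaint table. The sketch below runs one such query, complaint volume by product, against an in-memory SQLite table as a local stand-in for Amazon Athena. The table name, column names, and sample rows are hypothetical.

```python
import sqlite3
from typing import Iterable, List, Tuple

# Standard SQL aggregate; table and column names are assumptions.
VOLUME_BY_PRODUCT = """
SELECT product, COUNT(*) AS complaint_count
FROM complaints
GROUP BY product
ORDER BY complaint_count DESC, product
"""


def volume_by_product(rows: Iterable[Tuple[str, str, str]]) -> List[Tuple[str, int]]:
    """Load sample rows into SQLite and count complaints per product."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE complaints "
            "(ticket_id TEXT, product TEXT, buying_channel TEXT)"
        )
        conn.executemany("INSERT INTO complaints VALUES (?, ?, ?)", rows)
        return conn.execute(VOLUME_BY_PRODUCT).fetchall()
    finally:
        conn.close()


sample = [
    ("T-1", "Package A", "app"),
    ("T-2", "Package B", "web"),
    ("T-3", "Package A", "web"),
]
counts = volume_by_product(sample)
# counts == [("Package A", 2), ("Package B", 1)]
```

Swapping the grouping column for `buying_channel`, `payment_channel`, or a date column gives the other distributions and the trend over time.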
Scalable Analytics Foundation
- The data lake architecture built on Amazon S3 with query capability via Amazon Athena supports scalable analytics for millions of complaint records annually.
- The platform also provides a reusable foundation for future AI and advanced analytics use cases.

