The transportation sector, and intelligent transportation systems (ITS) in particular, generates large volumes of real-time data that need to be managed, communicated, interpreted, aggregated, and analyzed. Innovative big data processing and mining techniques, as well as optimization techniques, need to be developed and applied to support real-time decision-making. To this end, this paper presents an ETL (extract, transform, and load) architecture for intelligent transportation systems, addressing an application scenario on dynamic toll charging for highways. The ETL approach presented here is responsible for preparing the data used by traffic prediction services, which will dynamically affect toll prices in different contexts. The proposed architecture relies on “big data” technologies to process and store large volumes of data from heterogeneous sources, provided by different highway operators. It is capable of handling both real-time and historical data using technologies such as Spark on Hadoop and MongoDB. The DATEX-II data model is adopted to harmonize the traffic data provided by the highway operators. The work presented here is part of ongoing work under the EU H2020 OPTIMUM project. The preliminary results achieved so far do not represent the final conclusions of the project, but they demonstrate considerable performance gains compared with traditional ETL approaches, and they form the basis for identifying and discussing future work directions and opportunities in the development of big data processing and mining methods in the ITS domain.