Using AI to Find the Best Value Car: A Practical Guide

Are you in the market for a new car? I am, and I rely on online car listing aggregators to find my next perfect family hauler. With hundreds of listings, how do you find the best value? As a web programmer, I can’t resist performing some basic web scraping and running “AI-based” data analysis to help me pick the best value car quickly and efficiently. Using this simple approach, I can sift through large amounts of data and identify good deals that casual browsing might miss.

Why I Use Cars.com for Car Shopping

I think Cars.com is a great listing aggregator. It is useful because it lets me narrow down my selection by make, model, and year. My goal is to get a family minivan, ideally a Toyota Sienna from 2021 or later. Why? Because minivans are the ultimate family cars, perfect for long road trips. First, let’s go to Cars.com and get hundreds of local listings in seconds!

Optimizing Your Search on Cars.com

Use the available filters to narrow down your search by make, model, year, price, or whatever is important to you. By default, Cars.com shows only 20 results per page, and there is no user-visible control to adjust that. Fear not: you can raise the limit manually in the URL. Find the page_size segment and change it to its maximum of 100, that is, from ...&page_size=20... to ...&page_size=100... Anything above the maximum does not work: 101 falls back to 20.
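If you would rather not edit the URL by hand, the same change can be sketched in a few lines of JavaScript. This is only an illustration: withMaxPageSize is a made-up helper name, and the example URL and its zip parameter are placeholders rather than a real search. The only detail taken from Cars.com is the page_size parameter and its limit of 100.

```javascript
// Hypothetical helper: returns the search URL with page_size raised to the cap.
function withMaxPageSize(searchUrl, pageSize = 100) {
  const url = new URL(searchUrl);
  // Values above 100 fall back to 20 on the site, so clamp to the 1..100 range.
  const clamped = Math.min(Math.max(pageSize, 1), 100);
  url.searchParams.set("page_size", String(clamped));
  return url.toString();
}

// Placeholder search URL: the path and the zip parameter are assumptions.
const exampleUrl = withMaxPageSize(
  "https://www.cars.com/shopping/results/?page_size=20&zip=60601"
);
console.log(exampleUrl);
```

In the browser console on a search results page, something like location.href = withMaxPageSize(location.href) should reload the page with the larger limit.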

Get Basic Listings Info as CSV

With more listings per page, we have a better overview of the market. Now, in your web browser, inspect the page, open the console and run this JavaScript snippet:

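{ // Block scope, so the snippet can be pasted into the console more than once.
  // Header row; keep it in sync with the selectors below.
  const header = ['Used/New', 'Year Make Model Trim', 'Mileage', 'Price', 'Dealership Name'];
  // Hardcoded Cars.com selectors, one per column.
  const selectors = ['.stock-type', '.title', '.mileage', '.primary-price', '.dealer-name'];
  // Wrap a value in double quotes, doubling any embedded quotes (CSV escaping).
  const quote = (value) => '"' + value.replace(/"/g, '""') + '"';

  const rows = [header.map(quote).join(',')];
  document.querySelectorAll('.vehicle-card').forEach((card) => {
    const row = selectors.map((selector) => {
      // A missing element (for example, no price listed) becomes an empty cell.
      const text = card.querySelector(selector)?.textContent ?? '';
      return quote(text.replace(/[\r\n]+/g, '').trim());
    });
    rows.push(row.join(','));
  });

  console.log(rows.join('\n'));
}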

This snippet retrieves all the listings on a single search result page as CSV. Copy the console output and save it as a .csv file, which is one of the formats ChatGPT-4o can analyze (only xls, xlsx, csv, pdf, and JSON are supported). Note that the snippet contains hardcoded element selectors, which may change with the next site update! Hopefully, these elements will stay the same for a while, but that is not guaranteed. Now that we have the basic info in CSV format, let’s ask AI to analyze this data for us! Here is a video demonstration of the steps discussed up to this point:

Utilizing ChatGPT-4o for Car Data Analysis

Large language models are great at interpreting textual data, including data in CSV format. In this exercise, I will use ChatGPT-4o, which can run data analysis on user-uploaded files. How does data analysis work with ChatGPT?

To start, upload one or more data files, and ChatGPT will analyze your data by writing and running Python code on your behalf. It can handle a range of data tasks, like merging and cleaning large datasets, creating charts, and uncovering insights.

OpenAI, accessed 26 June 2024, Improvements to data analysis in ChatGPT, <https://openai.com/index/improvements-to-data-analysis-in-chatgpt>
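To make the "cleaning" part concrete for this dataset: the scraped values are strings such as "49,352 mi." and "$41,476", so they need to become numbers before anything can be compared. ChatGPT would do this in Python; the sketch below shows the same idea in JavaScript to stay consistent with the scraping snippet. cleanListing is a hypothetical helper, and it assumes whole-dollar prices and titles that start with a four-digit model year.

```javascript
// Hypothetical helper: turns one scraped row (five strings) into typed fields.
function cleanListing([stockType, title, mileage, price, dealer]) {
  // Keep digits only: "49,352 mi." -> 49352, "$41,476" -> 41476.
  // Assumes whole-dollar prices; text without digits (e.g. "Not Priced") becomes null.
  const toNumber = (text) => {
    const digits = text.replace(/[^0-9]/g, "");
    return digits === "" ? null : Number(digits);
  };
  // Assumes the title starts with a four-digit model year.
  const yearMatch = title.match(/^\s*(\d{4})\b/);
  return {
    stockType,
    year: yearMatch ? Number(yearMatch[1]) : null,
    model: yearMatch ? title.slice(yearMatch[0].length).trim() : title.trim(),
    mileage: toNumber(mileage),
    price: toNumber(price),
    dealer,
  };
}

// One of the example rows from this post.
const sample = cleanListing([
  "Used",
  "2021 Toyota Sienna XLE 7 Passenger",
  "49,352 mi.",
  "$41,476",
  "Gurley Leep Volkswagen",
]);
console.log(sample);
```

Rows where price or mileage comes back as null are probably best dropped or reviewed by hand before any comparison.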

Running Python code on your behalf is a really impressive and powerful feature. The AI not only has to prepare the data for processing, but also design an algorithm for comparing the cars, write the code itself, interpret the execution results, and then make further decisions, repeating the whole cycle until it reaches a satisfying result. ChatGPT-4o does all of that, and it explains its reasoning too. It works pretty well, but I wonder where its limitations are. As of the time of writing this blog post, the limits are 10 files per conversation and ~50MB per file. Let’s use the following prompt and put ChatGPT to work:

I have a CSV file containing basic information about various cars. The columns in the CSV are: Used/New, Year Make Model Trim, Mileage, Price, Dealership Name.
I want to determine the best value car from this list. The best value can be considered as the car that offers the best balance between price, mileage, and year. Explore the data and come up with an optimal algorithm. Group the results by the "Year Make Model Trim" column.

Here is an example of the CSV data:

"Used/New","Year Make Model Trim","Mileage","Price","Dealership Name"
"Used","2021 Toyota Sienna XLE 7 Passenger","49,352 mi.","$41,476","Gurley Leep Volkswagen"
"Used","2023 Toyota Sienna XLE","38,016 mi.","$40,955","Hertz Car Sales Des Plaines"
"Used","2023 Toyota Sienna LE","27,660 mi.","$37,994","Wilde Toyota"

Please provide a detailed analysis including which car offers the best value and any insights or patterns you can identify from the data. Moreover, identify dealers which I should visit first in order to optimize my car-hunting time starting from the one that offers best value cars or largest selection.

Here is the video of the chat in action:

Conclusion

This analysis is primarily based on two features, year and mileage, with price as the target variable. Due to randomness in the LLM, the results may vary on each try. Therefore, it is vital to review both the results and the methodology ChatGPT chose. In a professional analysis, you would want to include many more features, such as additional equipment and color. In reality, people also consider the actual condition of the car and how it drives. In conclusion, don’t treat this approach as the definitive truth about available cars, but as an additional tool to assist in your decision-making.
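For readers who want to check the methodology by hand, here is one plausible way to turn "year and mileage as features, price as the target" into code. It is a rough sketch and not necessarily what ChatGPT did in any given run: it fits an ordinary least-squares line price ≈ b0 + b1·year + b2·mileage and ranks cars by how far their asking price sits below the fitted price. The names solve3 and rankByValue and the sample listings are made up for illustration, and the code is JavaScript rather than the Python ChatGPT would write.

```javascript
// Solve the 3x3 system A x = b by Gauss-Jordan elimination with partial pivoting.
function solve3(A, b) {
  const M = A.map((row, i) => [...row, b[i]]);
  for (let col = 0; col < 3; col++) {
    let pivot = col;
    for (let r = col + 1; r < 3; r++) {
      if (Math.abs(M[r][col]) > Math.abs(M[pivot][col])) pivot = r;
    }
    [M[col], M[pivot]] = [M[pivot], M[col]];
    if (Math.abs(M[col][col]) < 1e-9) {
      throw new Error("Cannot fit: year and mileage carry no independent information");
    }
    for (let r = 0; r < 3; r++) {
      if (r === col) continue;
      const factor = M[r][col] / M[col][col];
      for (let c = col; c <= 3; c++) M[r][c] -= factor * M[col][c];
    }
  }
  return M.map((row, i) => row[3] / row[i]);
}

// Fit price ~ year + mileage and sort by "savings" (fitted price minus asking price).
function rankByValue(cars) {
  const mean = (xs) => xs.reduce((sum, x) => sum + x, 0) / xs.length;
  const yearMean = mean(cars.map((c) => c.year));
  const milesMean = mean(cars.map((c) => c.mileage));
  // Centered, rescaled features keep the normal equations well conditioned.
  const rows = cars.map((c) => [1, c.year - yearMean, (c.mileage - milesMean) / 1000]);
  const XtX = [[0, 0, 0], [0, 0, 0], [0, 0, 0]];
  const Xty = [0, 0, 0];
  rows.forEach((x, i) => {
    for (let a = 0; a < 3; a++) {
      Xty[a] += x[a] * cars[i].price;
      for (let b = 0; b < 3; b++) XtX[a][b] += x[a] * x[b];
    }
  });
  const beta = solve3(XtX, Xty); // normal equations: (X^T X) beta = X^T y
  return cars
    .map((c, i) => {
      const predicted = rows[i].reduce((sum, x, k) => sum + x * beta[k], 0);
      return { ...c, predicted, savings: predicted - c.price };
    })
    .sort((p, q) => q.savings - p.savings);
}

// Synthetic listings for illustration only; these are not real prices.
const listings = [
  { name: "A", year: 2021, mileage: 50000, price: 36500 },
  { name: "B", year: 2022, mileage: 30000, price: 39000 },
  { name: "C", year: 2023, mileage: 20000, price: 42500 },
  { name: "D", year: 2021, mileage: 60000, price: 34000 },
  { name: "E", year: 2023, mileage: 10000, price: 43000 },
  { name: "F", year: 2022, mileage: 32000, price: 35500 },
];
rankByValue(listings).forEach((c) =>
  console.log(c.name, Math.round(c.predicted), Math.round(c.savings))
);
```

A positive savings value means the car is priced below what the fitted line expects for its year and mileage. With only two features, a large gap can just as easily signal a lower trim or a rough history as a bargain, which is exactly why the caveats above matter.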