Math AI IA Research Question Generator

Sample Math AI IA Topic Ideas

Browse these sample topics to get inspired.

Medium

Investigating how linear and polynomial regression models compare in predicting monthly electricity consumption (kWh) for 100 households in Madrid during 2018–2020 based on temperature (°C) and household size (number of occupants).
Suggested Approach

Start by framing the research question exactly as given and plan a clear, realistic dataset workflow. Describe where you will get the monthly electricity consumption, temperature and household size data (city or municipal open data, national statistics, or a simulated but realistic sample) and explain any ethical or privacy considerations. Clean the data first: check for missing months, inconsistent household IDs and outliers; justify any removal or imputation choices. Create an exploratory section with descriptive statistics and visualizations (time series plots for consumption, scatterplots of consumption vs temperature, boxplots by household size and seasonal month) to show patterns and motivate the choice of predictors. State assumptions explicitly (linearity, independence, homoscedasticity, absence of multicollinearity) and note any domain assumptions you make about heating/cooling behaviour in Madrid between 2018–2020. Link these preparations directly to your research question so the reader sees why each decision matters for model comparison.

In the analysis, fit both linear regression and polynomial regression models using temperature and household size as main predictors, and consider interaction terms or higher-degree temperature terms only if exploratory plots suggest non-linearity. Split the data properly (for example by households or by time) and use cross-validation to compare predictive performance robustly; avoid using the same months for training and testing across households without justification. Calculate and report clear, comparable metrics: RMSE and MAE for prediction accuracy, R^2 for explained variance, and model-selection criteria such as AIC or BIC to penalize complexity. Examine residuals, leverage and influence plots to check model assumptions; if heteroscedasticity or non-normality appears, present transformations or robust regression as justified alternatives and show the derivation or the equations you used. Include sample calculations and show how coefficients were obtained (normal equations or software output), but also interpret coefficients in the context of kWh per °C or per occupant so the examiner sees practical meaning.
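The model comparison described above can be sketched in a few lines. This is a minimal illustration, not a prescribed method: the data are synthetic and made up (not real Madrid records), the U-shaped temperature effect and all coefficients are assumptions, and a single train/test split stands in for full cross-validation. The helper names (`solve`, `fit_least_squares`, `design`) are hypothetical.

```python
import math
import random

def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit_least_squares(X, y):
    """Return beta solving the normal equations (X^T X) beta = X^T y."""
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)

def rmse(X, y, beta):
    """Root mean squared prediction error of the fitted model on (X, y)."""
    sq = [(sum(c * v for c, v in zip(beta, r)) - yi) ** 2 for r, yi in zip(X, y)]
    return math.sqrt(sum(sq) / len(sq))

def design(data, degree):
    """Columns: intercept, temp^1..temp^degree, occupants."""
    return [[1.0] + [t ** k for k in range(1, degree + 1)] + [float(n)]
            for t, n, _ in data]

def targets(data):
    return [kwh for _, _, kwh in data]

# Synthetic data (assumption): consumption rises for both cold and hot months.
rng = random.Random(0)
rows = []
for _ in range(300):
    temp = rng.uniform(0.0, 35.0)       # monthly mean temperature in °C
    occupants = rng.randint(1, 5)       # household size
    kwh = 300.0 + 0.8 * (temp - 18.0) ** 2 + 40.0 * occupants + rng.gauss(0.0, 15.0)
    rows.append((temp, occupants, kwh))

train, test = rows[:200], rows[200:]    # simple hold-out split

results = {}
for degree in (1, 2):                   # 1 = linear, 2 = quadratic in temperature
    beta = fit_least_squares(design(train, degree), targets(train))
    results[degree] = rmse(design(test, degree), targets(test), beta)

print("hold-out RMSE by polynomial degree:", results)
```

On data built this way the quadratic model should show the lower hold-out RMSE, but that outcome is a property of the made-up data; with real records the comparison could go either way, which is the point of the investigation.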

When writing, follow the IA format: concise introduction with rationale and aims, detailed main body showing step-by-step mathematics and clear figures, and a conclusion that restates the research question and summarises which model predicts better and why. In the evaluation, be honest about limitations (dataset size, omitted variables like appliance efficiency or income, temporal correlation) and propose specific, realistic extensions (additional predictors, seasonality models, mixed-effects models by household). Cite all data sources and any statistical methods texts you used, keep the whole essay within the IA page limit, and ensure mathematical exposition is clear enough that an examiner can follow every step without guessing.

Hard

Modelling the optimal allocation of study time between mathematics and language subjects for a cohort of 60 IB Diploma students to maximize average predicted exam scores using constrained optimization and quadratic programming.
Suggested Approach

Start by framing the research question explicitly at the top of your introduction and explain why constrained optimization and quadratic programming are appropriate tools for this investigation. Define your decision variables (for example, average weekly study hours for mathematics and for language subjects per student or per cohort subgroup) and state clear, realistic constraints (total available study hours, curriculum minimum requirements, and any individual limits). Collect or construct a dataset representing 60 IB Diploma students: this can be real anonymized data from your school, a carefully justified synthetic dataset based on published study–score relationships, or a mixture where you estimate parameters from small samples. Explain and justify every assumption you make (linear vs. nonlinear effects of study time, diminishing returns modeled as quadratic terms, homogeneity within subgroups), and include any background theory on quadratic programming and convex optimization so the reader can follow why your model is solvable and how the objective function represents average predicted exam scores.

In the main analysis, derive your objective function step by step and show the algebraic work that leads to the quadratic form (e.g., average score = baseline + a·t_math + b·t_lang − c·t_math^2 − d·t_lang^2 − e·t_math·t_lang if modeling interaction). State the constraint equations in matrix form and show how they translate into the standard quadratic programming (QP) formulation min (1/2)x^T Q x + q^T x subject to Gx ≤ h, with equality constraints if needed. Because the standard form minimizes, maximize the predicted score by minimizing its negative, and choose symbols for the QP matrices and vectors (here Q, q, G, h) that do not clash with the score coefficients a to e. Use software to solve the QP (suggestions: Excel Solver, Python with CVXOPT/quadprog/scipy.optimize, or R’s quadprog), include the code or Solver setup as an appendix, and present sample calculations for at least one student or subgroup so the examiner can follow your method. Include clear graphs and tables: contour plots of the objective, feasible region, sensitivity analyses showing how optimal allocation changes if parameters vary, and a comparison of predicted average scores before and after optimization for the cohort of 60.
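For a two-variable version of this problem, the optimum can be checked by hand before trusting a solver. The sketch below assumes hypothetical coefficients (not estimated from any real data), a single weekly budget constraint t_math + t_lang ≤ T, and non-negative hours. It is not a general QP solver. It relies on the fact that a strictly concave quadratic over a triangle is maximized either at its stationary point, if that point is feasible, or somewhere on one of the three edges.

```python
# Hypothetical coefficients for illustration only (not estimated from real data).
BASE, A, B, C, D, E = 50.0, 4.0, 3.0, 0.2, 0.15, 0.05

def score(t_math, t_lang):
    """Predicted average score as a concave quadratic in weekly study hours."""
    return (BASE + A * t_math + B * t_lang
            - C * t_math ** 2 - D * t_lang ** 2 - E * t_math * t_lang)

def best_on_segment(p0, p1):
    """Maximise score along the segment p0 -> p1, a 1-D quadratic in s on [0, 1]."""
    def point(s):
        return (p0[0] + s * (p1[0] - p0[0]), p0[1] + s * (p1[1] - p0[1]))
    g0, gh, g1 = score(*point(0.0)), score(*point(0.5)), score(*point(1.0))
    quad = 2.0 * (g0 + g1 - 2.0 * gh)    # coefficient of s^2 (exact for a quadratic)
    lin = g1 - g0 - quad                 # coefficient of s
    if quad < 0:
        s = min(1.0, max(0.0, -lin / (2.0 * quad)))   # clamp the vertex to [0, 1]
    else:
        s = 0.0 if g0 >= g1 else 1.0
    return point(s)

def optimise(total_hours):
    """Return (t_math, t_lang) maximising score with t >= 0 and t_math + t_lang <= total_hours."""
    # Stationary point from the gradient: 2C*tm + E*tl = A and E*tm + 2D*tl = B.
    det = 4.0 * C * D - E * E
    assert C > 0 and det > 0, "objective must be strictly concave"
    tm = (2.0 * D * A - E * B) / det
    tl = (2.0 * C * B - E * A) / det
    if tm >= 0 and tl >= 0 and tm + tl <= total_hours:
        return tm, tl
    # Otherwise the maximum lies on the boundary of the feasible triangle.
    corners = [(0.0, 0.0), (total_hours, 0.0), (0.0, total_hours)]
    edges = [(corners[i], corners[(i + 1) % 3]) for i in range(3)]
    return max((best_on_segment(p, q) for p, q in edges), key=lambda pt: score(*pt))

t_math, t_lang = optimise(12.0)
print(f"t_math = {t_math:.3f} h, t_lang = {t_lang:.3f} h, score = {score(t_math, t_lang):.3f}")
```

With these made-up numbers the budget constraint binds, so the answer sits on the line t_math + t_lang = 12. Re-running `optimise` with different budgets or coefficients gives a simple sensitivity analysis, and the result can be compared with the output of whichever QP solver you use.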

Conclude by restating how your analysis answers the research question and reflecting critically on strengths, limitations and possible extensions. Discuss model validity (how well the quadratic form captures diminishing returns and interactions), data limitations (sampling bias, measurement error in study time), and practical concerns about implementing your recommendations in real school timetables. Suggest concrete improvements that maintain the original research question (for example, fitting better parameter estimates, including more subject categories or heterogeneous student types) and be explicit about which parts of your write-up belong in the introduction, main body, appendix (data, code, extra derivations) and references so your IA meets the 12–20 page requirement while showing rigorous mathematics and clear communication.


Medium

Analysing the relationship between shoe size and sprint time by fitting and comparing exponential and power-law models to a dataset of 200 high-school sprinters aged 15–18 in Ontario, Canada.
Suggested Approach

Start by describing your dataset and planning your method: state the research question exactly and summarise the sample (200 sprinters, ages 15–18, Ontario). Record how and when sprint times were measured and whether shoe sizes use a standard scale (e.g. US/UK/European) — convert to one consistent system and anonymize personal identifiers. Create a clean data table and show sample calculations of mean, standard deviation and any outliers you detect; justify any removals. Note possible confounding variables (age, sex, height, training level) and explain whether you will control for them by stratification, including them in multivariable models, or explicitly treating the analysis as exploratory. State clearly any assumptions you make about measurement error and the causal interpretation (for example you are modelling association, not proving causation). Keep this description concise in your introduction and list the objectives: fit an exponential model and a power-law model, compare their fits, and evaluate which model better describes the relationship between shoe size and sprint time.

In the main analysis, show step-by-step model fitting and all mathematical work so an examiner can follow your reasoning. Explain the functional forms you are fitting: y = a e^{b x} for the exponential and y = c x^{d} for the power law (use consistent notation), then show how to linearize them (taking logs) to use linear regression, derive parameter formulas or describe your software method, and include at least one worked example calculation. Produce and label scatter plots of raw data, overlay fitted curves, and include residual plots and tests for heteroscedasticity and normality of residuals. Compare models using numerical criteria (adjusted R^2, RMSE, AIC/BIC if available) and graphical diagnostics; explain what each metric means in plain language and why a model with better predictive accuracy might still be inappropriate if residuals show systematic patterns. If you control for confounders, present and interpret the adjusted coefficients and comment on how and why the relationship changes.
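The linearization step can be illustrated as follows. Taking logs turns y = a e^{b x} into ln y = ln a + b x, and y = c x^{d} into ln y = ln c + d ln x, so both reduce to a straight-line fit. The sketch uses noise-free, made-up values (not real sprinter data) generated from an assumed power law so that the recovered parameters can be checked; the function names are hypothetical.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def fit_exponential(xs, ys):
    """Fit y = a * exp(b x) via ln y = ln a + b x. Returns (a, b)."""
    intercept, slope = linear_fit(xs, [math.log(y) for y in ys])
    return math.exp(intercept), slope

def fit_power(xs, ys):
    """Fit y = c * x^d via ln y = ln c + d ln x. Returns (c, d)."""
    intercept, slope = linear_fit([math.log(x) for x in xs],
                                  [math.log(y) for y in ys])
    return math.exp(intercept), slope

def rmse(actual, predicted):
    """Root mean squared error on the original (seconds) scale."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Made-up illustration: EU shoe sizes and times generated from t = 20 * size^(-0.15).
sizes = [float(s) for s in range(36, 48)]
times = [20.0 * s ** -0.15 for s in sizes]

a, b = fit_exponential(sizes, times)
c, d = fit_power(sizes, times)

rmse_exp = rmse(times, [a * math.exp(b * s) for s in sizes])
rmse_pow = rmse(times, [c * s ** d for s in sizes])
print(f"exponential: a={a:.4f}, b={b:.5f}, RMSE={rmse_exp:.6f}")
print(f"power law:   c={c:.4f}, d={d:.5f}, RMSE={rmse_pow:.6f}")
```

One caveat worth stating in the write-up: fitting in log space minimizes squared errors of ln y, not of y, so the parameters can differ slightly from a direct nonlinear least-squares fit on the original scale.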

Conclude by restating whether the analysis answers the research question and summarising the main numerical findings with clear units (e.g. effect sizes interpreted as seconds per shoe-size or percent change per unit). Provide a careful evaluation: discuss limitations (sample representativeness, measurement error, age range, biological plausibility), ethical considerations and anonymization, and suggest realistic improvements or extensions (larger, more balanced sample, repeated measurements, or including height/leg length). Finish with concise practical takeaways for the reader and ensure your IA follows the required structure (cover page, contents, introduction, detailed math work, graphs, conclusion, evaluation, references) and stays within the 12–20 page guideline while showing full derivations, sample calculations and clearly labelled figures and tables.


Hard

Predicting daily bicycle-sharing demand at a specific docking station in central Amsterdam over one month by constructing and evaluating a time-series ARIMA model using historical hourly usage and weather variables (temperature, precipitation).
Suggested Approach

Start by clarifying the research question in one concise sentence at the top of your introduction and explain why predicting daily bicycle-sharing demand at this specific docking station in central Amsterdam over one month matters (for operations, planning or safety). Describe the data sources you will use: historical hourly usage for that station and matched hourly weather data (temperature, precipitation). State clearly any assumptions you must make (e.g., station identifier stays consistent, weather measured at a single nearby station, missing hours treated as zero or interpolated) and justify them briefly. Give a short background on time-series modelling and ARIMA: define stationarity, autoregressive (AR), integrated (I) and moving average (MA) components, and why ARIMA can capture temporal autocorrelation in demand; mention that exogenous regressors (weather) can be included via ARIMAX if you choose to model weather effects explicitly. Keep this section focused and limited to the theory you will apply, not exhaustive textbook derivations.

For the main analysis, document every data step in chronological order so an examiner can follow your work. Show how you preprocess: aggregate hourly counts to daily totals for the target month or explain why you retain hourly resolution and forecast daily sums from hourly predictions; handle missing data and outliers with clear methods and show sample calculations. Use plots (time series, ACF, PACF, decomposition) to justify choices of differencing and AR/MA orders; include the actual ACF/PACF plots and interpret them briefly. If you include weather variables, demonstrate correlation analysis (scatterplots, cross-correlation) and explain lag choices. Fit at least one ARIMA/ARIMAX model, show the fitted equations and parameter estimates, and compute diagnostic checks (residual plots, Ljung-Box test, normality). Compare candidate models using AIC/BIC and hold-out validation: reserve the last week (or a rolling cross-validation) and report forecast accuracy metrics (MAE, RMSE, MAPE) for daily predictions. Present example calculations of one forecast step to show your method.
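To show what "one forecast step" looks like, here is a minimal sketch of the simplest non-trivial case, ARIMA(1,1,0) without a constant. It is fitted by conditional least squares rather than the maximum-likelihood routines in statistical packages, it omits weather regressors, and the series is synthetic (deliberately much longer than one month so that the estimate is stable). All names and numbers are assumptions for illustration.

```python
import math
import random

def difference(series):
    """First difference: d_t = y_t - y_{t-1} (the 'I' step with d = 1)."""
    return [b - a for a, b in zip(series, series[1:])]

def fit_ar1(diffs):
    """Conditional least-squares estimate of phi in d_t = phi * d_{t-1} + e_t."""
    num = sum(cur * prev for prev, cur in zip(diffs, diffs[1:]))
    den = sum(prev * prev for prev in diffs[:-1])
    return num / den

def forecast_next(series, phi):
    """One-step ARIMA(1,1,0) forecast: y_{t+1} = y_t + phi * (y_t - y_{t-1})."""
    return series[-1] + phi * (series[-1] - series[-2])

def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Synthetic daily demand (assumption): differences follow an AR(1) with phi = 0.6.
rng = random.Random(1)
phi_true = 0.6
diffs = [0.0]
for _ in range(2000):
    diffs.append(phi_true * diffs[-1] + rng.gauss(0.0, 5.0))
series = [200.0]
for step in diffs[1:]:
    series.append(series[-1] + step)

# Hold out the last week and make rolling one-step-ahead forecasts.
train, test = series[:-7], series[-7:]
phi_hat = fit_ar1(difference(train))

history = list(train)
preds = []
for actual in test:
    preds.append(forecast_next(history, phi_hat))
    history.append(actual)      # reveal the true value before the next forecast

print(f"phi_hat = {phi_hat:.3f}")
print(f"hold-out MAE = {mae(test, preds):.2f}, RMSE = {rmse(test, preds):.2f}")
```

For example, with the last two observations 10 and 12 and phi = 0.5, the forecast is 12 + 0.5 × (12 − 10) = 13. A hand calculation like this is the kind of sample step an examiner can follow alongside software output.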

In the conclusion and evaluation, restate the research question and summarise whether your model accurately predicted daily demand and which variables mattered. Interpret practical implications (e.g., expected peak days, sensitivity to precipitation) but avoid overclaiming beyond your data and model assumptions. Critically evaluate limitations: short forecast horizon, potential nonstationarity due to events, spatial factors, or data quality, and explain concrete improvements (longer historical data, additional exogenous variables, or alternative models like SARIMA or machine learning) as possible extensions. Finish by listing your data and code sources in a consistent reference style and include an appendix with full code, full tables, and extra plots so the examiner can reproduce your analysis.


Easy

Optimizing the dimensions of a cylindrical coffee capsule to minimize material surface area for a fixed volume of 5 cm³ by applying calculus-based optimization and comparing results with a discrete numerical search.
Suggested Approach

Begin by clearly restating the research question at the top of your introduction and explain why minimizing surface area for a fixed volume matters practically (material cost, sustainability). Briefly summarise the geometric model: a cylinder with radius r and height h, volume constraint V = πr²h = 5 cm³, and surface area S = 2πr² + 2πrh (or exclude one base if the capsule is sealed to one side—state which physical model you choose and justify it). List your assumptions explicitly (perfect cylinder, negligible wall thickness or constant thickness, no manufacturing constraints) and explain how they limit your conclusions. Keep this part concise but precise so the examiner sees you understand the context and the mathematical tools you will use (calculus optimization and discrete numerical search). State your aim: use calculus to find exact extrema, then validate and compare with a discrete numerical search that simulates realistic manufacturing steps or measurement error ranges.

In the main analysis, start by algebraically reducing variables using the volume constraint (express h in terms of r) and derive S(r) to obtain a single-variable function. Show differentiation steps clearly: compute S'(r), solve S'(r)=0, check S''(r) or use the first derivative test to confirm a minimum, and compute the corresponding r and h. Include units and a numerical check that V = 5 cm³ holds. Parallel to this analytic method, set up a discrete numerical search: choose a realistic range and step size for r (and compute h from volume), calculate S for each pair, and present results in a table and a plotted graph. Use software (Desmos, GeoGebra, Python, or a spreadsheet) and include screenshots or exported plots with captions. Discuss numerical resolution, step-size effects, and any small differences between continuous calculus results and discrete search outputs, commenting on rounding, measurement error, and manufacturability (standard capsule sizes).
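The two methods can be placed side by side in a short script. The sketch assumes a closed cylinder with both bases counted, S = 2πr² + 2πrh. Substituting h = V/(πr²) gives S(r) = 2πr² + 2V/r, and setting S'(r) = 4πr − 2V/r² = 0 yields r³ = V/(2π), so h = 2r; S''(r) = 4π + 4V/r³ > 0 confirms a minimum. The search range and step size below are arbitrary choices for illustration.

```python
import math

V = 5.0  # fixed volume in cm^3

def height(r):
    """Height implied by the volume constraint V = pi * r^2 * h."""
    return V / (math.pi * r ** 2)

def surface(r):
    """Closed-cylinder surface area S(r) = 2*pi*r^2 + 2*V/r (both bases counted)."""
    return 2.0 * math.pi * r ** 2 + 2.0 * V / r

# Calculus result: S'(r) = 0  =>  r^3 = V / (2*pi), and then h = 2r.
r_exact = (V / (2.0 * math.pi)) ** (1.0 / 3.0)
h_exact = height(r_exact)

# Discrete numerical search: r from 0.30 cm to 2.00 cm in steps of 0.01 cm.
step = 0.01
grid = [round(0.30 + i * step, 2) for i in range(171)]
r_grid = min(grid, key=surface)
h_grid = height(r_grid)

print(f"calculus: r = {r_exact:.4f} cm, h = {h_exact:.4f} cm, S = {surface(r_exact):.4f} cm^2")
print(f"search:   r = {r_grid:.2f} cm, h = {h_grid:.4f} cm, S = {surface(r_grid):.4f} cm^2")
print(f"volume check: {math.pi * r_exact ** 2 * h_exact:.6f} cm^3")
```

Changing `step` to 0.1 or 0.001 shows directly how the resolution of the search affects how close the discrete answer gets to the calculus optimum.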

Conclude by restating how your results answer the research question and reflect on reliability and applicability. Summarise the exact optimum from calculus and the closest achievable dimensions from the discrete search, noting any practical constraints that could shift the choice (e.g., integer-millimeter tooling). Evaluate limitations of your model (wall thickness, non-cylindrical features, sealing) and propose realistic extensions: include shell thickness, cost function combining material and manufacturing, or use optimization with constraints. Finish with a clear reference list of any formulas, software, and sources you used, and ensure the internal logic flows so an examiner can follow every step from model to conclusion.


Generate the Best Math AI IA Research Questions

Our AI quickly transforms your keywords into unique, high-quality research questions. The process is simple: Select your subject, enter a few keywords, or leave the field blank for instant inspiration. Click 'Generate' to start browsing ideas.
