Lesson 1.2 · Module 1 · AI & ML Foundations · 12 min read

Training, data & bias

The model learned the world from a library it didn't choose — and that library was overwhelmingly Western. Here's how that accent shows up in your renders, and how to argue with it.

Ask for 'a beautiful home' and watch the model show you California.

Try it. Prompt almost any image AI for 'a beautiful modern home, photorealistic' and you'll likely get an open-plan glass box on a green American lawn, the lighting soft and Californian. Ask for 'a traditional house' and it leans European-medieval. The model is not biased against India out of malice. It learned what 'beautiful' and 'home' look like from billions of internet images that were captioned mostly in English and shot mostly in the West. It is showing you, faithfully, the average of what it was fed — and the diet was lopsided. For an Indian practice, that accent is the single most practical thing to understand about these tools.

The idea

Garbage in, gospel out — the data is the worldview

Step 01 — What they were trained on

A vast, scraped, English-leaning, Western-leaning library

Image models like Midjourney, Stable Diffusion and FLUX learned from hundreds of millions to billions of image-caption pairs scraped from the public web. Language models learned from a similar ocean of mostly-English text. Nobody hand-curated a balanced global syllabus; they took what was abundant and online.

And what's abundant online skews hard. English-language sources dominate. Western design media, stock photography and real-estate listings flood the set. So the model's idea of 'kitchen' is a Western kitchen with an island; its idea of 'living room' has a fireplace; its 'office' is a glass tower. The patterns it absorbed are its worldview, and that worldview has a postcode. As the saying goes: garbage in, garbage out — except here it's subtler and more dangerous, because the output looks polished and authoritative. Lopsided in, confident gospel out.

The training set is the worldview. Scraped from a web that is overwhelmingly English-captioned and Western-shot, the model's 'average home' has a postcode, and it isn't in India.

Step 02 — How the bias shows up at your desk

Default to generic-Western, fumble the specifically Indian

The bias is rarely dramatic. It's a quiet pull toward a default. Ask for a 'modern villa' and you get something that could sit in Los Angeles or Dubai but not obviously in Pune. Indian specifics get thinner and shakier the more particular you go: a Vaastu-compliant plan, a pooja room, a jaali screen, a Chettinad courtyard, a Mangalore-tile roof, a wet-and-dry Indian kitchen — the model has seen far fewer of these, so it renders them generically, half-right, or stereotyped.

Watch for three failure modes. Erasure: the Indian element simply doesn't appear unless you fight for it. Stereotype: 'Indian interior' collapses into a clichéd palette of saffron, brass and marigolds, ignoring the calm modern Indian home your client actually wants. Plausible-but-wrong: a jaali pattern that no mason ever cut, a 'traditional' detail that's an invented mash-up. None of these come with a warning label; they're rendered with the same confidence as everything else.

Three ways the bias surfaces. The Indian element vanishes (erasure), collapses into a saffron-and-brass cliché (stereotype), or appears as a detail no mason ever built (plausible-but-wrong).

The model isn't refusing India. It's averaging a library where India was a rounding error. Your prompt is how you re-weight the average.
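A toy sketch can make "averaging a lopsided library" concrete. Real image models do not work by counting captions, and every number and label below is invented for illustration only. The sketch assumes a hypothetical four-entry library and a hypothetical helper, `most_likely_look`, that returns whichever look is most frequent among entries matching every tag in the prompt.

```python
from collections import Counter

# Hypothetical caption counts, invented purely for illustration.
# They are NOT measurements of any real training set.
TOY_LIBRARY = [
    ({"home", "modern"}, "glass box on a lawn", 900),
    ({"home", "traditional"}, "European stone cottage", 80),
    ({"home", "traditional", "kerala", "nalukettu"}, "Kerala nalukettu courtyard house", 15),
    ({"home", "modern", "bengaluru"}, "brick-and-kota-stone villa", 5),
]


def most_likely_look(prompt_tags: set) -> str:
    """Return the most frequent look among entries that contain every prompt tag."""
    tally = Counter()
    for tags, look, count in TOY_LIBRARY:
        if prompt_tags <= tags:  # entry matches only if it carries all prompt tags
            tally[look] += count
    if not tally:
        return "no match"
    return tally.most_common(1)[0][0]


# A vague prompt lands on the majority look.
vague = most_likely_look({"home"})
# A specific prompt narrows the pool, so the minority look wins.
specific = most_likely_look({"home", "kerala"})

print(vague)
print(specific)
```

With the vague tag set the majority entry dominates; adding one precise regional tag shrinks the matching pool until the Indian entry is the only candidate. That is the sense in which a prompt re-weights the average.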

Step 03 — How to counter it in your prompts

Be specific, be local, be the curator the training set wasn't

You can't retrain the model, but you can steer it hard, because specificity overrides the default. The counter-bias toolkit is four moves.

Name the place and the vernacular precisely. Not 'traditional house' but 'Kerala nalukettu with sloping Mangalore-tile roof and central nadumuttam courtyard'. The richer and more correct your vocabulary, the better the model can find the right corner of latent space. Add real references — feed it a photo of the actual style (we cover img2img later) so it has a true anchor, not its averaged guess. Use negative prompts to push away the defaults it reaches for: 'no fireplace, no Western kitchen island'. And verify like a local — you are the cultural fact-checker the dataset lacked; if the jaali looks invented, it probably is.
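The four moves can be sketched as a small reusable structure. This is a minimal sketch, not any tool's API: `SteeringBlock`, its fields and its methods are hypothetical names introduced here. It returns the positive and negative text separately because tools differ in how they accept negatives, and it leaves attaching the reference image to whatever tool you use.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SteeringBlock:
    """Hypothetical container for the four counter-bias moves."""
    place: str                                            # move 1: name the place
    vernacular: List[str]                                 # move 1: precise vocabulary
    reference_image: Optional[str] = None                 # move 2: path to a real photo
    negatives: List[str] = field(default_factory=list)    # move 3: defaults to push away

    def build(self, subject: str) -> Tuple[str, str]:
        """Return (prompt, negative_prompt) as plain strings."""
        prompt = ", ".join([f"{subject} in {self.place}", *self.vernacular])
        negative = ", ".join(self.negatives)
        return prompt, negative

    def checklist(self) -> List[str]:
        """Move 4: one verification question per vernacular term."""
        return [f"Would a local builder recognise this {term}?" for term in self.vernacular]


kerala = SteeringBlock(
    place="Kerala",
    vernacular=[
        "nalukettu",
        "sloping Mangalore-tile roof",
        "central nadumuttam courtyard",
    ],
    negatives=["fireplace", "Western kitchen island"],
)

prompt, negative = kerala.build("traditional house")
print(prompt)
print(negative)
for question in kerala.checklist():
    print(question)
```

Keeping the vocabulary in one place means the same block can be reused across briefs, and the checklist keeps the verification step from being skipped when the render looks convincing.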

This is the spine again: the machine diverges from a biased average; you converge toward what's true to this place, this climate, this client. India's vernacular is your edge, not the model's.

Read it your way

For the architect

Treat the model as a talented intern who trained entirely abroad and has never seen your region. It will quietly import Western massing, Western fenestration and Western climate assumptions: a vast west-facing glass facade it thinks is 'modern' but which bakes a Chennai living room. Counter it with precise regional and climatic language, and remember that the National Building Code (NBC), local bye-laws and your hot-humid or composite climate logic are not something it reliably knows. Vernacular fluency is a moat AI can't cross without you.

For the interior designer

Your daily fight is the stereotype trap. 'Indian interior' will hand you a postcard of brass and marigolds when your client wants a serene, contemporary Bengaluru apartment with a discreet pooja niche and a hardworking wet-and-dry kitchen. Steer with specifics — material names, real regional styles, the actual lifestyle — and feed reference images of the calm, modern Indian homes you mean. Studio Matrx's Style Explorer is built on India-aware sets precisely to dodge this default; use it as both a tool and a reference library.

For the student & solo studio

Bias is easiest to miss when you're solo and moving fast — the output looks great, so you ship it. Build a tiny habit: before you trust any 'cultural' render, ask 'would a local builder recognise this detail?' If the model gave you a generic glass box for a project in Jaipur, that's the bias talking, and your specificity is the fix. Knowing more about Indian vernacular than the model does is genuinely your competitive advantage as a small studio.

Tools and where their training bites (as of 2026)

tools date fast · verify

Midjourney v7 (v8 emerging)

Aesthetic image model — strong default, strong bias

Gorgeous output, but its 'beautiful home' default is the most Western of the lot. Superb once you steer it hard with vernacular vocabulary; left vague, it will quietly relocate your project to a green Western suburb.

Adobe Firefly

Image model on licensed training data

Trained on Adobe Stock, openly licensed and public-domain imagery, and positioned as commercially safe; Adobe offers IP indemnification on outputs to eligible enterprise customers. That curation helps with copyright but doesn't make it India-aware: the Western and generic skew is still there to fight.

Studio Matrx Style Explorer / Moodboards

India-aware curated style sets

Built on apartment-and-villa sets across Indian rooms and styles, so the defaults are tuned for local taste rather than a Western average. Use it as a counter-bias reference even when you generate elsewhere — but it's a curated library, not an open generator.

Common misconception

“The next, bigger model will be trained on more data, so the bias will simply disappear.”

More data mostly means more of the same skew, because the abundant, English-captioned, Western-shot imagery just keeps growing fastest. Scale alone doesn't rebalance a lopsided library; it can entrench it. Models do improve on specific regions when deliberately fine-tuned, but you can't assume that. The safe stance is permanent: you steer with specificity and you verify the cultural detail yourself, however big the model gets.
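The arithmetic behind "scale alone doesn't rebalance" is simple enough to check by hand. The sketch below uses made-up counts and growth rates, chosen only to show the shape of the argument; they are assumptions, not estimates of any real dataset.

```python
def indian_share(western: float, indian: float) -> float:
    """Fraction of the library that is Indian-context imagery."""
    return indian / (western + indian)


# Hypothetical starting library: 980 Western images for every 20 Indian ones.
WESTERN, INDIAN = 980.0, 20.0

# Today's share.
share_now = indian_share(WESTERN, INDIAN)

# A model trained on ten times as much of the same mix: the share is unchanged.
share_scaled = indian_share(WESTERN * 10, INDIAN * 10)

# Five years in which Western imagery grows faster (assumed 30% vs 20% a year):
# the share shrinks even though the Indian count rises.
share_skewed = indian_share(WESTERN * 1.30 ** 5, INDIAN * 1.20 ** 5)

# Deliberate curation: adding 200 Indian images to the original library.
share_rebalanced = indian_share(WESTERN, INDIAN + 200)

print(f"now:        {share_now:.3f}")
print(f"10x scale:  {share_scaled:.3f}")
print(f"skewed:     {share_skewed:.3f}")
print(f"rebalanced: {share_rebalanced:.3f}")
```

Under these assumptions, multiplying the whole library leaves the proportion where it was, faster Western growth pushes it down, and only the deliberate addition moves it up. That mirrors the lesson's point: curation or fine-tuning changes the balance, size on its own does not.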

Hands-on workshop

Free: any image AI. A second tab with reference photos of the real regional style helps.

Workshop — audit the bias, then beat it

You'll catch the model defaulting Western, then re-steer it to a genuinely Indian result on the same brief — proving that specificity, not a bigger model, is the cure. Twenty minutes.


Copy & adapt

ROUND 1 -- the lazy prompt (watch it default):
"a beautiful modern home, photorealistic"
"a traditional house, photorealistic"
"an Indian interior, photorealistic"

ROUND 2 -- the steered prompt (same brief, specific):
"contemporary Bengaluru villa, exposed brick and
kota stone, deep verandah, jaali screen for shade,
flat RCC roof, tropical planting --no fireplace,
no snow, no Western kitchen island"

1. Run all three Round 1 prompts. For each, note the giveaways: green Western lawn, fireplace, glass tower, saffron-and-brass cliché. You're documenting the default worldview.
2. Name which of the three failure modes you're seeing in each: erasure, stereotype, or plausible-but-wrong. Be specific about the detail that's off.
3. Run the Round 2 steered prompt. Compare side by side with Round 1's 'modern home'. Mark what changed once you supplied real vernacular vocabulary and negative prompts.
4. Stress-test one Indian specific the model struggles with: a Vaastu-correct entry, a proper wet-and-dry kitchen, a real jaali pattern. Judge it as a local: would a builder recognise this, or did the model invent it?
5. Add a reference image of the true style if your tool allows, re-run, and note how much the anchor improves accuracy versus words alone.
6. Write a reusable 'India steering block': the three or four phrases and negatives that reliably pulled the model home. Save it for every future regional prompt.

You’ll walk away with
A before/after bias audit on one brief, plus a reusable 'India steering block' of vernacular phrases and negative prompts that drags any model away from its Western default.
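If you would rather keep the audit in a file than on paper, the notes from the workshop steps can be sketched as a small log. The names `AuditEntry`, `FAILURE_MODES` and `summarise` are hypothetical, and the sample entries are placeholders to be replaced with what you actually observe; they are not recorded results from any tool.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List

# The lesson's three failure modes, plus "none" for a render that passes.
FAILURE_MODES = {"erasure", "stereotype", "plausible-but-wrong", "none"}


@dataclass
class AuditEntry:
    """One render, judged by a person who knows the region."""
    prompt: str
    failure_mode: str
    giveaway: str  # the specific detail that was off

    def __post_init__(self) -> None:
        if self.failure_mode not in FAILURE_MODES:
            raise ValueError(f"unknown failure mode: {self.failure_mode!r}")


def summarise(entries: List[AuditEntry]) -> Counter:
    """Count how often each failure mode appeared."""
    return Counter(entry.failure_mode for entry in entries)


# Placeholder observations: overwrite these with your own notes.
audit = [
    AuditEntry("a beautiful modern home, photorealistic", "erasure", "green lawn, nothing regional"),
    AuditEntry("an Indian interior, photorealistic", "stereotype", "saffron-and-brass palette"),
    AuditEntry("steered Bengaluru villa prompt", "none", "jaali checked against a reference photo"),
]

for mode, count in summarise(audit).items():
    print(mode, count)
```

A log like this turns the before/after comparison into something you can revisit when a tool updates: rerun the same prompts and see whether the tally moves.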

Try it

Two more quick checks, if you have five minutes.

01. Ask an LLM to 'list typical features of a modern home' and count how many are Western defaults (basement, fireplace, garage, drywall) versus Indian realities (overhead tank, puja space, balcony, RCC).
02. Generate 'a doctor' and 'an architect' a few times and watch the demographic default. Same bias mechanism, different subject; it's worth seeing that it's everywhere.
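The count in the first check can be sketched in a few lines. The two reference sets come straight from the lists above; the function name `tally_features` and the sample answer are hypothetical, so paste in whatever your LLM actually returns.

```python
from typing import Dict, List

# Reference sets taken from the lesson's own examples.
WESTERN_DEFAULTS = {"basement", "fireplace", "garage", "drywall"}
INDIAN_REALITIES = {"overhead tank", "puja space", "balcony", "rcc"}


def tally_features(features: List[str]) -> Dict[str, int]:
    """Sort each feature into western, indian or other, ignoring case and spacing."""
    counts = {"western": 0, "indian": 0, "other": 0}
    for feature in features:
        key = feature.strip().lower()
        if key in WESTERN_DEFAULTS:
            counts["western"] += 1
        elif key in INDIAN_REALITIES:
            counts["indian"] += 1
        else:
            counts["other"] += 1
    return counts


# Hypothetical LLM answer, for illustration only.
sample_answer = ["Fireplace", "Garage", "Open-plan kitchen", "Balcony", "Basement"]

result = tally_features(sample_answer)
print(result)
```

Exact matching is deliberately crude: a feature worded differently (say, 'attached garage') falls into 'other', so skim that bucket by eye before drawing conclusions.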

The idea to carry forward

A model's worldview is its training data, and that data was an English-leaning, Western-leaning library where Indian context was thin. So its defaults pull generic-Western and it fumbles Indian specifics through erasure, stereotype, or plausible-but-wrong detail. You can't retrain it, but specificity overrides the default — name the vernacular precisely, feed references, use negatives, and verify as the local fact-checker the dataset never had.

In one breath

Models learn the world from scraped, English-and-Western-heavy data, so 'beautiful home' defaults West and Indian specifics come out erased, stereotyped or invented. Counter it: precise regional vocabulary, real reference images, negative prompts against the defaults, and your own local verification. Bigger models don't fix this — your specificity does.

Make it real

Questions

Why does AI default to Western architecture and interiors?

Because it learned from billions of internet images and texts that were captioned mostly in English and shot mostly in the West. It has simply seen far more Western homes than Indian ones, so 'beautiful home' averages out to a Western default. It isn't malice — it's a lopsided library showing you its average, very confidently.

How do I get AI to produce genuinely Indian designs?

Be specific and local. Replace 'traditional house' with the real vernacular — 'Kerala nalukettu with central nadumuttam courtyard and Mangalore-tile roof'. Feed reference images of the actual style, use negative prompts to push away Western defaults like fireplaces and snow, and verify every cultural detail yourself. Specificity overrides the model's generic pull.

Will newer, bigger AI models fix this bias?

Not reliably. More data usually means more of the same skew, since Western, English-captioned imagery keeps growing fastest. Models can improve on specific regions when deliberately fine-tuned, but you can't assume it. The durable approach is to steer with specificity and verify the cultural detail yourself, no matter how advanced the model becomes.

If specificity is the cure for bias, then specificity is a skill worth mastering on its own. The next lesson treats prompting as exactly that: a craft, with an anatomy you can learn.
