
ADITYA MAHESHWARI


Delhi: OpenAI and Urban Identity is a data-driven exploration of how artificial intelligence can redefine our understanding of cities. This thesis merges urban design theory with emerging technologies to assess the imageability of Delhi using Kevin Lynch’s five elements—paths, edges, districts, nodes, and landmarks. By leveraging OpenAI’s natural language processing and vision models, along with crowdsourced data from platforms like Google Reviews, TomTom traffic data, and NASA night light imagery, the project captures how people perceive and experience Delhi in real time. A custom-built KNIME workflow automates data collection, processing, and classification, assigning identity scores to different urban components. The model, trained using data from 20+ global cities, achieves 85% accuracy in predicting urban identity patterns. This work not only visualizes Delhi's identity across zones but also proposes revitalization strategies, policy recommendations, and a framework for real-time AI-based urban monitoring. Through this portfolio, the thesis unfolds as a dynamic story of how AI and urban planning intersect—bridging perception, technology, and spatial reality to reimagine the future of our cities.

Thesis - Delhi : OpenAI and Urban Identity

Introduction

AIM and Objectives of the Thesis

AIM

To assess the urban identity of Delhi using Kevin Lynch’s framework, supported by AI-driven analysis of crowdsourced data.

OBJECTIVES

  1. To identify and map Lynch’s five urban elements through crowdsourced digital data.

  2. To apply AI and NLP techniques for analyzing qualitative urban perceptions.

  3. To develop a predictive model for urban identity using ANN and CNN architectures.

  4. To validate the AI-based approach against traditional urban analysis methods.

  5. To propose data-informed interventions for urban structure, policy, and real-time monitoring in Delhi.

Aim & Objectives

Urban identity is the unique character or personality of a city, shaped by its history, culture, architecture, and physical environment. It's how residents and visitors perceive and recognize the city, often tied to what makes it distinct.

When we define urban identity based on existing academic knowledge, we can say that it encapsulates the intricate interplay between individuals and their surrounding environments, wherein each exerts formative influences upon the other’s identity.


A broader array of distinctive elements—both palpable and abstract—situates this dynamic relationship that confers uniqueness upon an urban locale, shaping its distinctive sense of place. Urban identity is an intricate concept that reflects a city’s or urban area’s distinct physical, social, and cultural characteristics [1]. The built environment, economic structures, cultural practices, and social interactions all play a role in shaping it.

Source - A Comprehensive Methodological Approach for the Assessment of Urban Identity

HISTORY OF DELHI

Sir Thomas Holderness, Permanent Secretary at the India Office, persuaded Lutyens to share the commission with Herbert Baker (Lutyens’s old architect friend).

● A Town Planning Committee was formed with two architects, John A. Brodie as engineer, and S.C. Swinton as municipal expert.

● Baker went on to design the Secretariat, Parliament House, and bungalows.

● Lutyens did the city planning and designed the Viceroy’s House (Rashtrapati Bhawan), the Princely Houses, and India Gate.

History


MASTER PLANS IN DELHI

❖ Master Plan for Delhi, 1962, focused on the following key areas:

● Controlling the irregular and unplanned growth of Delhi

● Planning Delhi in the context of its region, with decentralisation of employment

● Conserving areas that have a healthy organic pattern

Masterplans in Delhi


ANALYSIS OF PHYSICAL AND SOCIAL INFRASTRUCTURE

MPD 2001


MPD 2021


The journey towards the Delhi Master Plan 2021 involved several key milestones over the years. Planning for the master plan formally began in 2002, the same year the Delhi Metro was inaugurated. Following this, the Yamuna Action Plan (Phase II) was initiated in 2004. In 2005, the Jawaharlal Nehru National Urban Renewal Mission (JNNURM) was launched. Significant planning developments occurred in 2007 with the NDMC Sub City Plan and the delineation of the Lutyens Bungalow Zone (LBZ) by DUAC. The city hosted the Commonwealth Games in 2010. Subsequently, the Rajiv Awaas Yojana was introduced in 2011. By 2015, major national initiatives like the Smart City Mission and Housing for All were launched. The year 2016 saw multiple developments, including the Jaha Jhopdi Wahin Makaan Yojana for slum rehabilitation, the introduction of Unified Building Bye Laws, and Demonetization. Finally, in 2018, the River Rejuvenation Committee for Delhi was established.

ANALYSIS OF PHYSICAL AND SOCIAL INFRASTRUCTURE

MPD 2041

The Delhi Master Plan 2041 is guided by the vision to "Foster a Sustainable, Liveable and Vibrant Delhi." This vision is supported by three core goals: firstly, to become an environmentally sustainable and healthy city adaptable to climate change; secondly, to develop into a future-ready city providing quality, affordable, safe living alongside efficient mobility systems; and thirdly, to emerge as a dynamic hub for economic, creative, and cultural development. The plan acknowledges Delhi's demographic significance, noting it holds 1.39% of India's population with high literacy (86.2%), an 11 million workforce, and a large youth segment, with 97% urban residency. It also includes five-year growth projections. Environmentally, a key achievement highlighted is the near doubling of green cover between 2001 and 2017. Heritage conservation strategies involve creating and regularly updating an asset inventory and offering incentives for preserving and repurposing heritage buildings. The transport approach is multi-pronged, aiming to improve connectivity and infrastructure, encourage a shift to shared mobility, make the city more walkable and cyclable, and implement effective parking management.


Methodology for Analysis

01

Understanding Delhi: Historical and Planning Context
The study begins with a review of Delhi’s historical evolution, key planning milestones, and spatial development patterns as outlined in the Master Plans of 1962, 2001, 2021, and 2041. This provides the contextual foundation for analyzing the city's identity.

02

Element Identification and Data Acquisition
Based on Kevin Lynch’s framework, five urban elements—paths, edges, districts, nodes, and landmarks—are spatially delineated. OpenAI tools are used to extract basic metadata (names, coordinates), while additional data is collected from Google Reviews, TomTom, Wikipedia, TripAdvisor, and NASA nightlight imagery.

03

Physical Surveys and Data Validation
Field visits are conducted to collect real-world observations, user reviews, and photographs across various landmarks and urban elements. This primary data is used to cross-validate and refine the digital data collected from online platforms, ensuring ground truth reliability.

04

Data Processing and Feature Engineering
Data is processed through KNIME workflows for sentiment analysis (NLP), visual feature extraction (OpenAI Vision), spatial clustering, and parameter normalization. Each element is assessed using a set of defined metrics relevant to its urban function.

05

Model Development and Training
A hybrid CNN-ANN model is trained using data from 20+ global cities to classify urban identity performance. The model evaluates urban elements and generates a composite identity score for each zone.

06

Application and Visualization
The trained model is applied to Delhi, producing detailed spatial outputs that highlight strong and weak identity zones. Results are visualized through an interactive dashboard for planning, monitoring, and proposal formulation.
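The parameter normalization mentioned in step 04 can be illustrated with a minimal sketch. It assumes simple min-max scaling to the range 0 to 1; the thesis does not state which normalization the KNIME workflow applies, and the function name and sample ratings are hypothetical.

```python
def min_max_normalize(values):
    """Rescale a list of numbers to the range [0, 1] (assumed scheme)."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # All values identical: no spread to rescale, so return zeros.
        return [0.0 for _ in values]
    return [(value - lowest) / (highest - lowest) for value in values]


# Hypothetical ratings for three landmarks, not thesis data.
ratings = [3.0, 4.0, 5.0]
normalized = min_max_normalize(ratings)
print(normalized)
```

Scaling every parameter to a common range keeps a metric with large raw values (for example, a landmark count) from dominating one with small raw values (for example, a sentiment score).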


Data Collection

This section outlines the diverse and multi-dimensional datasets used to analyze the urban identity of Delhi, mapped through Kevin Lynch’s five elements. All data collection processes were executed through automated and scalable workflows in KNIME, making the model repeatable and easily extendable to other cities. Using this setup, data was collected consistently for over 20 global cities to train the predictive AI model.

Google Reviews & Ratings

Collected using the Google Places API integrated into KNIME via the GET Request node. Data such as review text, user ratings, and timestamps were extracted for landmarks, public spaces, and other urban elements to capture public sentiment and popularity.
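To show the kind of table such a request could feed into the rest of the workflow, the sketch below flattens a Place Details style response into one row per review. It is an illustration only: the thesis performs this step in KNIME, the sample payload and the function name `flatten_reviews` are hypothetical, and the field names (`result`, `reviews`, `text`, `rating`, `time`) are assumed from the legacy Place Details response format and should be checked against current Google documentation.

```python
from datetime import datetime, timezone


def flatten_reviews(payload):
    """Turn one Place Details style response into flat review rows."""
    result = payload.get("result", {})
    rows = []
    for review in result.get("reviews", []):
        unix_time = review.get("time")  # assumed to be Unix seconds
        rows.append({
            "place": result.get("name", ""),
            "rating": review.get("rating"),
            "text": review.get("text", "").strip(),
            "timestamp": (
                datetime.fromtimestamp(unix_time, tz=timezone.utc).isoformat()
                if unix_time is not None else None
            ),
        })
    return rows


# Hypothetical sample payload, not real review data.
sample = {
    "result": {
        "name": "Example Landmark",
        "reviews": [
            {"rating": 5, "text": " Iconic and well maintained. ", "time": 1700000000},
            {"rating": 3, "text": "Too crowded on weekends.", "time": 1700086400},
        ],
    }
}

rows = flatten_reviews(sample)
for row in rows:
    print(row)
```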

Wikipedia Content

Data on the historical and cultural significance of places was extracted through the Wikipedia API, automated within KNIME. The textual data was later used in sentiment and narrative analysis.

OSM (OpenStreetMap) Data

Open-source spatial data on green spaces, built-up areas, and public infrastructure was downloaded and parsed using the OSM Web Feature Service (WFS) through KNIME’s geospatial extension nodes.

TomTom Traffic Data

Real-time data on congestion levels and speed delays was extracted via TomTom’s public traffic API, automated within KNIME workflows. This data was essential for evaluating the performance and navigability of major city paths.

Google Street View Images

Street View images were collected using the Google Street View Static API, coordinated via batch processing in KNIME. These images served as inputs for visual analysis in the OpenAI Vision model.
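A batch of image requests of this kind could be prepared as in the sketch below, which builds one Street View Static API URL per compass heading for a single point. The endpoint and the `size`, `location`, `heading`, and `key` parameters are assumed from the public API documentation; the headings, image size, approximate coordinates, and placeholder key are illustrative choices, not values from the thesis.

```python
from urllib.parse import urlencode

STREET_VIEW_ENDPOINT = "https://maps.googleapis.com/maps/api/streetview"


def street_view_urls(lat, lng, api_key, headings=(0, 90, 180, 270), size="640x640"):
    """Build one request URL per heading for a single location."""
    urls = []
    for heading in headings:
        params = {
            "size": size,
            "location": f"{lat},{lng}",
            "heading": heading,
            "key": api_key,
        }
        urls.append(f"{STREET_VIEW_ENDPOINT}?{urlencode(params)}")
    return urls


# Approximate coordinates near India Gate; the key is a placeholder.
urls = street_view_urls(28.6129, 77.2295, api_key="YOUR_API_KEY")
for url in urls:
    print(url)
```

Requesting several headings per point gives the vision model views along and across the road, which matters for metrics such as road straightness.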

Traffic Department Crash Data

Accident and crash data reports were obtained from the Delhi Traffic Police's public datasets. These were imported into KNIME using File Reader nodes and geocoded for spatial correlation with paths.

TripAdvisor Ratings

TripAdvisor data was extracted through web scraping techniques using KNIME’s XPath and HTML Parser nodes, enabling the capture of tourism-related public opinions on landmarks and districts.

NASA Nightlight Imagery (Black Marble)

Nightlight raster images were downloaded from NASA’s Earth Data repository and processed using KNIME’s Raster Processing and ImageJ integration, representing economic activity and spatial vitality.

Field Surveys & Observations

Primary data including photographs, user interviews, and physical observations were manually recorded and fed into KNIME’s data table for comparison and cross-validation with crowdsourced data.

Scalability & Repeatability

All data workflows were designed to be modular and reusable, allowing the same structure to be applied across different cities. This enabled the successful collection and processing of data for over 20 cities, forming the training base for the hybrid AI model used to predict urban identity.

KNIME Model & Data Processing

This section presents the data processing methodology used to analyze and rate elements of urban identity (landmarks, districts, and pathways) through a replicable and scalable pipeline built entirely on the KNIME Analytics Platform. The model integrates diverse data sources (APIs, satellite imagery, crowd-sourced content, and computer vision outputs) and processes them into usable scores for evaluation. Each dataset undergoes automated processing workflows that ensure consistency across the 20+ cities in the training set. The workflows are modular and allow for easy updates or scaling, making the system adaptable for future research or urban applications.

KNIME model

Google Ratings

Definition: A numerical score out of 5, based on user ratings on Google Maps.

Relevance: Reflects the landmark's popularity and perceived quality. High ratings

Sentiment Analysis (Google/Twitter/Reddit):

Definition: A score (typically -1 to 1) derived from natural language processing (NLP) of textual reviews

Relevance: Provides qualitative insight into public perception, revealing nuanced feelings beyond numerical ratings.
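To make the -1 to 1 scale concrete, here is a deliberately simple stand-in: a word-list scorer that returns the balance of positive and negative words in a review. This is not the NLP model used in the thesis; the word lists, the function name, and the sample review are all hypothetical.

```python
import re

# Tiny illustrative word lists (assumptions, not a real sentiment lexicon).
POSITIVE = {"beautiful", "clean", "peaceful", "iconic", "great"}
NEGATIVE = {"crowded", "dirty", "noisy", "unsafe", "poor"}


def sentiment_score(text):
    """Return (positive - negative) / (positive + negative), in [-1, 1]."""
    words = re.findall(r"[a-z']+", text.lower())
    positive = sum(word in POSITIVE for word in words)
    negative = sum(word in NEGATIVE for word in words)
    total = positive + negative
    if total == 0:
        return 0.0  # no opinion words found: treat as neutral
    return (positive - negative) / total


score = sentiment_score("Beautiful and iconic, but crowded")
print(round(score, 3))
```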

Proximity Number

Definition: The count of other landmarks within a 1.2 km radius of a given landmark, indicating clustering.

Relevance: Reflects the cultural or historical density of an area, which enhances a landmark's significance in urban identity.
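The Proximity Number can be computed directly from landmark coordinates. The sketch below assumes great-circle (haversine) distance and the 1.2 km radius stated above; the landmark names and coordinates are hypothetical points chosen for illustration, and the thesis may measure distance differently inside KNIME.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def proximity_number(target, landmarks, radius_km=1.2):
    """Count the other landmarks within radius_km of the target landmark."""
    target_lat, target_lon = landmarks[target]
    return sum(
        1
        for name, (lat, lon) in landmarks.items()
        if name != target and haversine_km(target_lat, target_lon, lat, lon) <= radius_km
    )


# Hypothetical landmark coordinates (latitude, longitude).
landmarks = {
    "A": (28.6000, 77.2000),
    "B": (28.6050, 77.2000),  # roughly 0.56 km north of A
    "C": (28.6000, 77.2100),  # roughly 0.98 km east of A
    "D": (28.6500, 77.2000),  # roughly 5.6 km north of A
}

counts = {name: proximity_number(name, landmarks) for name in landmarks}
print(counts)
```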

Landmark and Nodes Data Processing parameters

Parameters​

KNIME MODEL

KNIME model for Processing Landmark data

District-Level Processing parameters

Number of Landmarks:

Reflects the district’s cultural and historical significance, aligning with Lynch’s landmarks as memorable features.

TripAdvisor Rating:

Provides a crowdsourced measure of visitor satisfaction, reflecting the district’s tourism appeal and public perception, a key aspect of urban identity.

Number of Nodes:

Captures focal points (e.g., transit hubs, squares) where activities converge, supporting Lynch’s nodes as strategic orientation points.

Wikipedia Sentiment (Historical Importance):

Gauges the district’s historical narrative through text analysis, capturing its cultural depth and resonance, which aligns with Lynch’s emphasis on mental mapping based on historical landmarks.

Green Space Area:

Represents environmental quality and recreational value, contributing to a district’s livability and perceptual appeal, which ties into Lynch’s imageability through natural edges or districts.

Mean of Night Light Image (NASA Black Marble):

Measures economic activity and urban vitality through satellite data, offering an objective proxy for a district’s development level.
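The night-light parameter reduces to a mean over the raster cells that fall inside a district. A minimal sketch, assuming the raster has already been read into a grid of radiance values and the district boundary rasterized into a same-sized 0/1 mask (both hypothetical here):

```python
def masked_mean(raster, mask):
    """Mean of raster cells where the mask is truthy."""
    values = [
        value
        for raster_row, mask_row in zip(raster, mask)
        for value, inside in zip(raster_row, mask_row)
        if inside
    ]
    if not values:
        raise ValueError("mask selects no cells")
    return sum(values) / len(values)


# Hypothetical 3x3 radiance grid and district mask, not Black Marble data.
raster = [
    [2.0, 4.0, 30.0],
    [6.0, 8.0, 40.0],
    [50.0, 60.0, 70.0],
]
district_mask = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 0],
]

district_mean = masked_mean(raster, district_mask)
print(district_mean)
```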

Pathway-Level Processing

Speed Delay and Congestion Level

These TomTom-derived metrics measure the functional efficiency of pathways, reflecting how easily people navigate them
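One way to derive such metrics from a single road segment is sketched below. The field names (`flowSegmentData`, `currentSpeed`, `freeFlowSpeed`, `currentTravelTime`, `freeFlowTravelTime`) are assumed from TomTom's Flow Segment Data response; the definitions of delay and congestion used here (extra travel time, and extra travel time as a percentage of free-flow time) are assumptions, since the thesis does not spell out its formulas, and the sample values are hypothetical.

```python
def path_metrics(segment):
    """Derive delay and congestion figures from one flow segment record."""
    data = segment["flowSegmentData"]
    free_flow_time = data["freeFlowTravelTime"]  # seconds
    # Clamp at zero so faster-than-free-flow readings do not go negative.
    delay_s = max(0, data["currentTravelTime"] - free_flow_time)
    speed_drop = max(0, data["freeFlowSpeed"] - data["currentSpeed"])
    return {
        "delay_s": delay_s,
        "congestion_pct": 100.0 * delay_s / free_flow_time,
        "speed_drop": speed_drop,
    }


# Hypothetical segment record, not real traffic data.
sample_segment = {
    "flowSegmentData": {
        "currentSpeed": 24,
        "freeFlowSpeed": 40,
        "currentTravelTime": 300,
        "freeFlowTravelTime": 180,
    }
}

metrics = path_metrics(sample_segment)
print(metrics)
```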

Road Straightness and Directional Approach

These AI-vision-derived metrics evaluate navigational clarity.

Presence of Landmarks

Identifies notable features along paths (e.g., India Gate along Rajpath), reinforcing their memorability and cultural significance

OpenAI Vision Model

Utilizes street view images to quantify road straightness, landmark presence, and directional complexity.

Sentiment Analysis (Wikipedia):

Assesses the historical importance of a path through text analysis, capturing its cultural narrative


KNIME model for Downloading and Processing google street view Images data

Machine Learning Model Training

To translate complex urban experiences into measurable insights, a supervised Artificial Neural Network (ANN) model was trained on data from over 20 cities worldwide. This dataset includes cities with varying scales, histories, cultures, and spatial forms—ensuring that the model is not biased toward any one typology or planning style.

The model classifies each urban element as Excellent, Average, or Needs Improvement based on its contribution to urban imageability, in line with Kevin Lynch’s framework.

Before training, data from various sources (Google Reviews, TomTom traffic data, Street View images, Wikipedia content, OSM layers, and satellite imagery) was processed and converted into structured numerical parameters. These include sentiment scores, landmark density, road complexity, historical importance, and environmental quality.

The entire training process was executed within the KNIME Analytics Platform, allowing seamless integration of visual, spatial, and textual data. KNIME’s modular, no-code environment made the model both scalable and repeatable, with the flexibility to adapt the same workflow to any other city.
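To illustrate how a feed-forward network maps numerical parameters to one of the three classes, the sketch below runs a forward pass through a single hidden layer with a softmax output. It is a toy under stated assumptions: the weights are hand-picked rather than trained, the three input features are hypothetical normalized values, and the layer sizes do not reflect the thesis model, which was built and trained inside KNIME.

```python
from math import exp

LABELS = ["Needs Improvement", "Average", "Excellent"]


def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + bias
        for row, bias in zip(weights, biases)
    ]


def relu(values):
    return [max(0.0, value) for value in values]


def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def classify(features, hidden_w, hidden_b, out_w, out_b):
    """Forward pass: features -> hidden ReLU layer -> softmax over labels."""
    hidden = relu(dense(features, hidden_w, hidden_b))
    probabilities = softmax(dense(hidden, out_w, out_b))
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return LABELS[best], probabilities


# Hand-picked illustrative weights (NOT trained values).
HIDDEN_W = [[1.0, 1.0, 1.0], [-1.0, -1.0, -1.0]]
HIDDEN_B = [0.0, 1.5]
OUT_W = [[0.0, 2.0], [0.0, 0.0], [1.5, 0.0]]
OUT_B = [0.0, 1.5, -1.5]

# Hypothetical inputs: normalized rating, sentiment, proximity.
strong_label, strong_probs = classify([0.9, 0.8, 0.7], HIDDEN_W, HIDDEN_B, OUT_W, OUT_B)
weak_label, weak_probs = classify([0.1, 0.2, 0.1], HIDDEN_W, HIDDEN_B, OUT_W, OUT_B)
print(strong_label, weak_label)
```

In a trained model the weights would be learned from the labeled elements of the training cities; only the structure of the computation carries over from this sketch.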

Why KNIME for Model Training?

All training was done in KNIME, which offered a seamless, scalable, and visual platform for machine learning. Key advantages included:

 End-to-End Workflow: From data import and preprocessing to model training and evaluation—all in a single interface

Python Integration: Use of TensorFlow and Keras inside KNIME via Conda environment nodes 

Live Monitoring: Real-time feedback on training accuracy, confusion matrix, and performance metrics

Modular Reusability: The model can be easily applied to other cities or datasets with minimal changes

Benchmarking: Easy comparison of ANN performance against alternative models like Random Forest or SVM


Artificial Neural Network (ANN) for predicting urban identity scores


Machine Learning (ML) model for training on Street View images.

Maps (outcomes of the KNIME model's urban identity scores) of cities used as training data for the ANN model; the sample shows 12 of the 20 cities used for training.

Prediction Results for Delhi

Using the trained ANN model, urban identity scores were predicted for various elements across Delhi, based on the parameters aligned with Kevin Lynch’s framework. Each element was classified as Excellent, Average, or Needs Improvement, reflecting its contribution to the city’s imageability.

This data-driven assessment provides a fresh lens to understand how different parts of Delhi are perceived and remembered, supporting more informed planning, design, and revitalization strategies.

Map of Delhi showing landmarks and nodes plotted with scoring as predicted by KNIME model

Prediction results

Insights

Improved Correlation: The trendline shows a stronger positive correlation (estimated R² ≈ 0.75 vs. 0.72 originally), reflecting better alignment with the 94% within-one-category agreement.

Reduced Discrepancies: Only Red Fort remains a significant outlier (0.9046, 2.33), likely due to survey emphasis on crowding not fully captured by static sentiment.

Success Demonstration: The 94% agreement rate proves the methodology's success when weights are optimized to reflect survey priorities (ratings and sentiment over proximity).
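The two validation figures quoted in these insights, R² and within-one-category agreement, can be computed as in the sketch below. It assumes R² is the squared Pearson correlation of a simple linear trendline and that categories are compared as ordinal ranks; the sample scores and ranks are hypothetical and do not reproduce the thesis results.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation, equal to R² of a simple linear fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return (cov * cov) / (var_x * var_y)


def within_one_agreement(model_ranks, survey_ranks):
    """Share of items whose model and survey ranks differ by at most one."""
    pairs = list(zip(model_ranks, survey_ranks))
    hits = sum(abs(model - survey) <= 1 for model, survey in pairs)
    return hits / len(pairs)


# Hypothetical scores and category ranks (0 = lowest category).
model_scores = [1.0, 2.0, 3.0, 4.0]
survey_scores = [2.0, 4.0, 6.0, 8.0]
model_ranks = [2, 1, 0, 2]
survey_ranks = [2, 2, 0, 0]

fit = r_squared(model_scores, survey_scores)
agreement = within_one_agreement(model_ranks, survey_ranks)
print(fit, agreement)
```

Within-one-category agreement is more forgiving than exact agreement, so it is worth reporting both when comparing model output against survey responses.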

Map of Delhi showing districts with scoring as predicted by KNIME model


NLT - Night light imagery of Delhi, with mean values used as a parameter for districts

Map of Delhi showing Major Pathways with scoring as predicted by KNIME model



Sample of 10 / 7,340 street view images downloaded using KNIME model pathway analysis


Sample image segmentation results for urban elements from the KNIME model

Thesis Proposals
(in progress; visit again to see updates on the proposals)


Following the in-depth analysis of Delhi’s urban identity using AI and data-driven techniques, the thesis culminates in a set of targeted proposals aimed at strengthening the city's imageability, livability, and planning responsiveness. These proposals are rooted in the spatial diagnostics derived from the model’s predictions and align with the five urban elements outlined by Kevin Lynch.

01

Urban Structure Plan for Delhi

A citywide strategy to enhance imageability through improved spatial continuity, stronger visual anchors, and integration of AI-generated identity zones into the urban design framework.

02

Zone-Specific Revitalization Strategies

Detailed proposals for low-scoring areas, focusing on strengthening local landmarks, improving connectivity, and enhancing environmental quality through targeted interventions.

03

Policy & Institutional Recommendations

Development of planning guidelines, zoning revisions, and an institutional framework to incorporate real-time perception data and AI tools into decision-making processes.

04

Area-Based Detailed Proposals

Micro-level design proposals for selected neighborhoods and corridors, informed by both AI-based predictions and field observations. These detail interventions at the street and block scale, including façade treatments, signage, public space activation, and green networks.

05

Urban Identity Dashboard

A proposed web-based dashboard developed using KNIME, enabling planners to view identity scores for landmarks, districts, and pathways in real-time. This dashboard serves as a planning support system for continuous monitoring, analysis, and benchmarking of urban identity indicators.

The KNIME-based analytical models and dashboard tools developed in this research are original contributions. These workflows were designed from scratch to evaluate urban identity using structured and unstructured data, with scalability across multiple cities. I am currently in the process of filing a patent for the methodology and model architecture developed during this thesis.
