In [9]:

```
import warnings
warnings.filterwarnings("ignore")
import datetime
import pandas as pd
# import pandas.io.data
import numpy as np
import matplotlib
from matplotlib import pyplot as plt
import sys
import sompylib.sompy as SOM  # from pandas import Series, DataFrame
pd.__version__
%matplotlib inline
```

“Having the Answers to the Known Questions”

“Learning to Ask Good Questions”

- **Independence of the objects**
- **Notion of an abstract object**

- **Representing a house based on its features**
- **A student based on their grades**
- **A country based on its GDP**

**Dependence of the objects on each other**

- **A word in a text**
- **A pixel of an image**
- **A chemical element in a molecule**
- **A specific ingredient in a food recipe**
- **A building in its neighborhood**
- **A person in a social network**

In [10]:

```
N = 500
# Earlier synthetic-data experiments are dropped; we keep the final
# setup: x1 uniform on [-5, 5], y a two-bump function plus noise
x1 = np.random.rand(N) * 10 - 5
x1 = np.sort(x1)
x1 = x1[:, np.newaxis]
noise = 0.1

def f(x):
    x = x.ravel()
    return np.exp(-x ** 2) + 1. * np.exp(-(x - 1) ** 2)

y = f(x1) + np.random.normal(0.0, noise, N)
y = y[:, np.newaxis]

def polynomial_regr(degree=1):
    from sklearn.preprocessing import PolynomialFeatures
    from sklearn import linear_model
    X_tr = x1[:].astype(float)
    y_tr = y[:].astype(float)
    poly = PolynomialFeatures(degree=degree)
    X_tr_ = poly.fit_transform(X_tr)
    regr = linear_model.LinearRegression()
    regr.fit(X_tr_, y_tr)
    # Predict on the training data and overlay the fitted curve
    y_pred_tr = regr.predict(X_tr_)
    plt.plot(X_tr, y_tr, '.b', markersize=6, alpha=.4)
    plt.plot(X_tr, y_pred_tr, '-r', markersize=10, alpha=1)
```

In [11]:

```
from ipywidgets import interact, HTML, FloatSlider
interact(polynomial_regr,degree=(1,160,1));
```
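The widget above lets the polynomial degree grow all the way to 160, which eventually fits noise rather than signal. A minimal sketch of why that matters, using the same scikit-learn tools but freshly generated synthetic data (none of it from the notebook's variables): training error keeps shrinking as the degree grows, while the error on held-out data does not.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
# Same kind of two-bump target as above, regenerated here
x = np.sort(rng.rand(200) * 10 - 5)[:, np.newaxis]
y = np.exp(-x.ravel() ** 2) + np.exp(-(x.ravel() - 1) ** 2) \
    + rng.normal(0, 0.1, 200)

x_tr, x_ts, y_tr, y_ts = train_test_split(x, y, test_size=0.5, random_state=0)

errors = {}
for degree in (1, 3, 9, 15):
    poly = PolynomialFeatures(degree=degree)
    regr = LinearRegression().fit(poly.fit_transform(x_tr), y_tr)
    tr_err = np.mean((regr.predict(poly.transform(x_tr)) - y_tr) ** 2)
    ts_err = np.mean((regr.predict(poly.transform(x_ts)) - y_ts) ** 2)
    errors[degree] = (tr_err, ts_err)
    print(degree, tr_err, ts_err)
```

Comparing the two columns per degree is the standard way to pick a degree: choose the one where held-out error is lowest, not where training error is lowest.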

In [12]:

```
# 150 noisy samples around the point (2, 2), drawn faintly,
# with the optimum point itself drawn on top
for i in range(150):
    x = 2 + np.random.randn()*.04
    y = 2 + np.random.randn()*.04
    plt.plot(x, y, 'ow', markersize=16, alpha=.1)
    plt.plot(x, y, 'or', markersize=6, alpha=.05)
plt.plot(2, 2, 'ow', markersize=16)
plt.plot(2, 2, 'or', markersize=6)
plt.axis('off')
plt.title("space of potential functions and the optimum function")
```

**Central memory means that this abstract point can generate a complete instance of the objects!**

In [5]:

```
# A single column of storage points
for i in range(5, 6, 1):
    for j in np.linspace(1, 10, num=10):
        plt.plot(i, j, 'ow', markersize=16)
        plt.plot(i, j, 'or', markersize=6)
plt.xlim(0, 10)
plt.ylim(0, 11)
plt.axis('off')
```

- **By looking at different features of the objects**
- **By focusing on different areas of the state space**
- **Many powerful algorithms are in this category**

In [6]:

```
from matplotlib.patches import Rectangle

currentAxis = plt.gca()
for i in np.linspace(1, 10, num=5):
    # A rectangle around each column of prototype nodes
    currentAxis.add_patch(Rectangle((i - .6, 1 - .5), 1.2, 10.1,
                                    fill=None, alpha=1))
    for j in np.linspace(1, 10, num=10):
        plt.plot(i, j, 'ow', markersize=16)
        plt.plot(i, j, 'or', markersize=6)
plt.xlim(0, 12)
plt.ylim(0, 11)
plt.axis('off')
```

In [7]:

```
from matplotlib.patches import Rectangle

currentAxis = plt.gca()
for i in np.linspace(1, 10, num=5):
    currentAxis.add_patch(Rectangle((i - .6, 1 - .5), 1.2, 10.1,
                                    fill=None, alpha=1))
    for j in np.linspace(1, 10, num=10):
        if np.random.rand() > .3:
            # inactive prototype (red)
            plt.plot(i, j, 'ow', markersize=16)
            plt.plot(i, j, 'or', markersize=6)
        else:
            # activated prototype (green), with an outgoing arrow
            plt.plot(i, j, 'ow', markersize=16)
            plt.plot(i, j, 'og', markersize=6)
            currentAxis.arrow(i + .6, j, .7, 0, head_width=0.15,
                              head_length=.1, fc='g', ec='g')
plt.xlim(0, 12)
plt.ylim(0, 11)
plt.axis('off')
plt.title("combinatorial representation of objects")
print("Green nodes and arrows are activated prototypes for a specific input.")
```

**In addition, in this architecture the models also learn a hierarchical representation of the objects, unlike the previous architectures, where the representation of the data is fixed in advance!**

**This is of great importance in more complex applications where objects are composed of lower-level features:**

- **A floorplan as the composition of its elements**
- **A sentence as an ordered collection of words, and a word as an ordered collection of characters**
- **A city as a composition of its road segments and building patterns**
- **Biological systems, ...**

**In principle, machine learning and data offer a universal way to solve (or dissolve) many real-world problems that are hard for classical expert-based or theoretical approaches.**

**In principle, from the view of ML, different domain-specific problems become similar to each other.**

- Supply Chains and manufacturing systems
- Transportation Dynamics
- Air Pollution Modeling
- Water Flow Modeling
- Real Estate Market
- Urban Modeling
- Economic Networks
- Natural Language Modeling
- ...

The Basic Idea

- Urban Morphology is the study of urban forms and patterns
- But it is currently limited to theoretical and abstract models, which are based on limited observations
- In terms of ML, urban planners work with "a-priori" rules

**What if we collect data for thousands of cities and use ML to study them in a more data-driven way? Then we can answer questions such as:**

- **What are the clusters of emergent urban forms at the global scale?**
- **What are the characteristics of each cluster?**
- **Or predict quantitative features of cities (e.g. road pollution) by looking at their urban forms: streets, buildings, satellite images**

Data Collection

- Collecting images of street networks from OSM via styled maps from Mapbox
- It is also possible to get the geometric information of road networks and buildings. There are around 150M digital buildings in OSM format!

In [8]:

```
from IPython.display import YouTubeVideo
YouTubeVideo('QFF5IezOdaU',width=700, height=600)
```

Out[8]:

Main Elements

- How to **compare** city maps?
- To learn a **dense representation** of each city

Main Techniques and Frameworks

- **Convolutional Auto-Encoders to learn the dense representations of each city**
- We used TensorFlow; the code will be added here
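The project's actual models are convolutional auto-encoders in TensorFlow, which are not reproduced here. As a stand-in to show the autoencoder idea itself — train a network to reconstruct its own input, then keep the activations of the narrow bottleneck layer as the dense representation — here is a small *dense* (non-convolutional) autoencoder sketched with scikit-learn's `MLPRegressor`; the toy "city image" data is invented for illustration.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.RandomState(0)
# Toy stand-in for city images: 100 samples of 64 "pixels"
# lying near a 2-D subspace, so a 2-unit bottleneck can capture them
basis = rng.randn(2, 64)
codes = rng.randn(100, 2)
X = codes @ basis + 0.01 * rng.randn(100, 64)

# Hidden layers 32 -> 2 -> 32; the 2-unit layer is the bottleneck.
# Training target equals the input: the network learns to reconstruct.
ae = MLPRegressor(hidden_layer_sizes=(32, 2, 32), activation='tanh',
                  max_iter=2000, random_state=0)
ae.fit(X, X)

def encode(X):
    # Manual forward pass through the first two layers,
    # stopping at the bottleneck activations
    a = X
    for W, b in list(zip(ae.coefs_, ae.intercepts_))[:2]:
        a = np.tanh(a @ W + b)
    return a

Z = encode(X)
print(Z.shape)  # each sample reduced to a 2-D dense vector: (100, 2)
```

A convolutional version replaces the dense layers with convolution/pooling in the encoder and upsampling in the decoder, which is what makes the approach work on map images.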

Some Initial Results

- Using KNN on the learned dense vectors for each city

- Further, Self Organizing Maps for dimensionality reduction and visualization
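The KNN step above can be sketched as follows: given one dense vector per city (random stand-ins here; in the project they come from the auto-encoder), find each city's nearest neighbors in the representation space. The city names and vectors are invented for illustration.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.RandomState(0)
cities = ['Zurich', 'Geneva', 'Berlin', 'Munich', 'Paris', 'Lyon']
# Hypothetical 16-dimensional dense vectors, one per city
dense_vectors = rng.randn(len(cities), 16)

nn = NearestNeighbors(n_neighbors=3).fit(dense_vectors)
dist, idx = nn.kneighbors(dense_vectors)

for c, row in zip(cities, idx):
    # row[0] is the city itself; the rest are its closest neighbors
    print(c, '->', [cities[k] for k in row[1:]])
```

Cities whose vectors are close should have visually similar street-network patterns, which is exactly what the dense representation is trained to encode.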

Next Steps

The Basic Idea

- Initially: to see how easy it is to predict real estate property values, **BUT THERE WAS NO DATA AVAILABLE!**
- So, we started crawling publicly available data in Switzerland and Germany + open geodata
- Prediction was easily possible with ML (94% accuracy (ARE) for rental price estimations in Switzerland)
- This took us further into this application
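A minimal sketch of this kind of rental-price regression, with made-up features and a simple absolute-relative-error (ARE) style score. The real pipeline, its crawled Swiss data, and the reported 94% figure are not reproduced here; every feature and coefficient below is invented.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
n = 1000
# Hypothetical listing features
area = rng.uniform(30, 200, n)               # m^2
rooms = np.round(area / 30) + rng.randint(0, 2, n)
dist_center = rng.uniform(0, 20, n)          # km to city center
# Synthetic rent with noise (all coefficients invented)
rent = 25 * area + 150 * rooms - 10 * dist_center + rng.normal(0, 100, n)

X = np.column_stack([area, rooms, dist_center])
X_tr, X_ts, y_tr, y_ts = train_test_split(X, rent, random_state=0)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_tr, y_tr)
pred = model.predict(X_ts)

# Mean absolute relative error; "accuracy" read as 1 - ARE
are = np.mean(np.abs(pred - y_ts) / np.abs(y_ts))
print('accuracy (1 - ARE):', 1 - are)
```

On real listings the feature set is much richer (geocoded location, OSM context, ad text fields), which is where the crawling and open-geodata steps below come in.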

Main Elements

- Continuous Collection of online ads through web crawling
- Geo-Coding: Google API
- Collecting any other Open Source Data: OSM, Geoadmin.ch
- Applying unsupervised ML for filling the empty fields
- Automated Evaluation Model on a server
- Interactive web application
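The "filling the empty fields" step in the list above can be sketched with KNN imputation: a missing attribute is filled in from the most similar listings. This is only an illustration of the idea — the feature names and values are invented, and the project's actual imputation method is not specified beyond "unsupervised ML".

```python
import numpy as np
from sklearn.impute import KNNImputer

# Columns: area (m^2), rooms, floor; np.nan marks fields
# that were missing in the crawled ad
ads = np.array([
    [80.0,  3.0, 2.0],
    [82.0,  3.0, np.nan],   # floor missing
    [120.0, np.nan, 1.0],   # room count missing
    [45.0,  1.5, 4.0],
    [118.0, 4.5, 1.0],
])

# Each missing value is filled from the 2 most similar complete rows
imputer = KNNImputer(n_neighbors=2)
filled = imputer.fit_transform(ads)
print(filled)
```

Because similarity is computed over the observed fields only, a listing with area 120 gets its missing room count from other large listings rather than from the small ones.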

Main Techniques and Frameworks

- Self Organizing Maps for multidimensional probabilistic models
- Ensemble models such as Random Forests (Scikit-learn)
- Flask as web framework
- Leaflet for mapping application
- Mapbox Layers
- D3 for interactive visualizations