BE A REALIST: Uncover The Truth in Singapore Private Property Market
Fu Yi, Orkhan Hasanli, Tan Yong Ying Joanne

Abstract— The Singapore government closely observes the property market and occasionally implements new policies as cooling
measures to prevent the market from heating up too quickly. Even though several government and non-government organizations
have already created visualization tools to explain the property market in Singapore, these tools are unable to reveal deeper information about a
property market that is complicated by nature. Thus, there is still much potential in using new tools to advance the understanding
and visualization of changes in the market.
Through the integration of R packages, our application will help users to discover patterns and compare differences between
property prices in different administrative areas over time. Firstly, we used plotly to chart the comparison between total units sold
and SIBOR (Singapore Interbank Offered Rate) which serves as a main factor of fluctuation in number of units sold. Secondly, by
creating the geofacet map for Singapore, we visualised changes in median unit price over time from the perspective of planning
areas and postal districts. Thirdly, by coordinating the views between two visualizations, we used a treemap as a user interface to
update the ridgelines plot which zooms into the distribution of prices in a specific region by property type and type of sale. Lastly, we
used the Local Indicators for Spatial Autocorrelation (LISA) analysis to reveal clusters of properties in Singapore by their median
unit price, and incorporated the results into an interactive map using the tmap package. For the ridgelines plot and LISA analysis
mentioned above, all private property types (apartment, condominium, executive condominium, detached house, semi-detached
house and terrace house) as well as all types of sale (new sale, resale and sub-sale) are provided for users to
drill down into.
Index Terms— SIBOR (Singapore Interbank Offered Rate), Geofacet, Ridgelines, Local Indicators for Spatial Autocorrelation
(LISA), Data Visualization, Geo-spatial Temporal Analysis.

MOTIVATION

In Singapore, as in many other nations, housing markets are characterised by the co-existence of a freely priced part of the market with a part that is subject to varying degrees and forms of government intervention and regulation. The Singapore housing market has an especially complex institutional structure with its large regulated public housing sub-sectors. Private housing prices are affected by the standard determinants of supply and demand as well as by many government policies. However, the extent of the effects varies across locations, time, property type and type of sale, and is influenced by factors like SIBOR (Singapore Interbank Offered Rate).

Property transactions in Singapore require buyers to pay BSD (Buyer's Stamp Duty) for documents executed for the sale and purchase of property. Liable buyers are required to pay ABSD (Additional Buyer's Stamp Duty) on top of the existing BSD. ABSD and BSD are computed on the purchase price as stated in the dutiable document or the market value of the property.

Over the last twenty to thirty years, property prices have witnessed a roller coaster of changes, which followed changes in SIBOR and the introduction of new government policies. In the most recent decade, the economy of Singapore rebounded after the 2008 financial crisis. As a result, Singapore private property prices heated up significantly from 2009, and they only started cooling down when the Singapore government announced that property tax rates would be made more progressive over two years from January 2014. After a four-year slump until the end of 2017, the property market started to bounce back and prices have shown an increasing trend. On the other hand, the US Federal Reserve's interest rate has increased five times since President Trump took office in January 2017, and two more increases are expected in 2018. This news has already had a substantial impact on Singapore's overall economy; when the Federal Reserve interest rate increases, the Singapore Interbank Offered Rate (SIBOR) increases accordingly. IRAS (Inland Revenue Authority of Singapore) made an announcement in February 2018 that the property tax would increase to 4% for properties with house price over 1 million SGD. Furthermore, on 6th July 2018, the ABSD increased drastically yet again, and we are interested to see if this cooling measure will have a strong impact on the Singapore property market.

OBJECTIVE

Private property prices in Singapore cannot be easily analysed through private-sector supply and demand analysis or simple trend-line forecasting alone. Therefore, it is imperative to analyse prices with a more thorough approach. Currently, there is no tool which can provide an interactive and unbiased visualization of the property market. Most of the published data are still presented in static tables and the accompanying visualizations are quite basic. Different types of trends are illustrated statically and the graphs are not explanatory enough to show the full picture of the dataset, which prevents readers from getting useful insights and findings.

Our objectives for this project can thus be summarized into the following points:
1. Visualize the relationship between units sold and SIBOR on a yearly, quarterly and monthly basis.
2. Provide visualizations that illustrate price trends across all administrative areas in Singapore.
3. Provide an interactive and coordinated tool which shows the density of the total number of units sold in locations, together with the drilled-down distribution of median unit prices for a specified area.
4. Reveal clusters of properties by their median unit price using the Local Indicators of Spatial Autocorrelation (LISA) analysis.
5. Present all the above visualizations in a scalable, portable and easy-to-use web application through the integration of R packages and the R Shiny framework.

REVIEW AND CRITIQUE OF PAST WORKS

1.1 Private Property Index by URA
The Urban Redevelopment Authority provides a quarterly price index for private residential property which is presented as a static graph that starts from the first quarter of 1993. The drawback of the
plot provided by URA is that the index uses the 2009 Q1 figure as the base to display price changes over the quarters. Users do not have the option to change the base year or to look at monthly or yearly changes.

Fig. 1. URA's Singapore private residential property price index.

1.2 Property Market Cycle Model
This plot shows the upward trend in property transaction volume. It also points out every "Early Bull" and "Early Bear" phase, so that investors can draw their own insights from historical data on when the right time is to enter the market. The aesthetics of the plot are good too, as it includes the property price index as a reference line. However, the main drawback of this plot is the lack of interactivity, as it does not allow users to specify the property type or type of sale to dig deeper into. It also lacks the ability for users to observe trends at a more granular time interval.

Fig. 2. Property Market Cycle Model.

ANALYTICS APPROACH
To provide the ideal interactive visualization application that we have in mind, we explored different system design and aesthetic principles learnt in our data visualization journey. These will be described and highlighted in detail in the following sub-sections. This is a summary of our application creation process:

Fig. 3. The group's approach to the project as a process.

DATASET AND DATA PREPARATION

1.3 Attribute Data
The main dataset of interest is the data of all private property transactions in Singapore provided by REALIS. We downloaded all the available data from 1995 to 2018 through the SMU Library database.
Datasets on SIBOR were downloaded from the Monetary Authority of Singapore website (https://secure.mas.gov.sg/dir/domesticinterestrates.aspx). We obtained the Interbank 1-Month rates at a monthly frequency between January 1995 and August 2018.

1.4 Geo-spatial Data
To obtain the projection system to be used for our project, we downloaded the Singapore Region Boundary shapefile from Data.gov.sg (https://data.gov.sg/dataset/master-plan-2014-region-boundary-no-sea).

1.5 Data Preparation
Because we are interested in analyzing prices from a broader perspective, we used the dplyr package to aggregate the transactions. In the end, we obtained a new aggregated dataset which contains the median unit price of all transactions for each project on a monthly basis. dplyr provides many intuitive functions that allow us to select variables, filter unwanted rows and aggregate measures using the summarize function.
For the LISA analysis, we need to plot the projects as spatial points, thus geocoding is required to obtain the x and y coordinates of projects. From the aggregated dataset, we selected the project names and postal codes to be geocoded using Kun Sheng's code (https://github.com/tankunsheng/SgPostalToLatLng), which leverages the SLA OneMap REST API.
In the end, the aggregated dataset is used for most visualizations in our application, while the geocoded LISA dataset is used in the LISA analysis.

1.6 Geofacet Grid Preparation
One of the plots created using the geofacet package requires a row-by-column grid of the locations in the geography of interest. During this project, there were no existing grids for Singapore's planning areas and postal districts. Therefore, we used the Geo Grid Designer (https://hafen.github.io/grid-designer/) to manually create the grids based on a reference image, saved them as data frames, then used them with the facet_geo function to create facet plots that are representative of the geography.

Fig. 4. Screenshot of our Singapore Planning Area grid in Geo Grid Designer.
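The monthly aggregation described in Section 1.5 can be sketched as follows. The authors performed this step with dplyr in R; the Python version below only illustrates the computation, and the field names (`project`, `month`, `unit_price`) and all sample values are assumptions made for the example, not the REALIS schema.

```python
from collections import defaultdict
from statistics import median


def monthly_median_unit_price(transactions):
    """Group transactions by (project, month) and take the median unit price."""
    groups = defaultdict(list)
    for t in transactions:
        groups[(t["project"], t["month"])].append(t["unit_price"])
    # One output row per project-month, sorted so the result is stable
    return [
        {
            "project": project,
            "month": month,
            "median_unit_price": median(prices),
            "n_transactions": len(prices),
        }
        for (project, month), prices in sorted(groups.items())
    ]


# Hypothetical sample records (names and values invented for illustration)
sample = [
    {"project": "PROJECT A", "month": "2018-01", "unit_price": 15000.0},
    {"project": "PROJECT A", "month": "2018-01", "unit_price": 17000.0},
    {"project": "PROJECT A", "month": "2018-01", "unit_price": 16000.0},
    {"project": "PROJECT A", "month": "2018-02", "unit_price": 16500.0},
    {"project": "PROJECT B", "month": "2018-01", "unit_price": 9000.0},
    {"project": "PROJECT B", "month": "2018-01", "unit_price": 11000.0},
]

for row in monthly_median_unit_price(sample):
    print(row)
```

Using the median rather than the mean keeps a single unusually expensive or cheap unit from dominating a project's monthly figure, which is presumably why the aggregated dataset is built on medians.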
Fig. 5. Reference map of Singapore Planning Areas used to create the grid in Fig. 4.

VISUAL DESIGN FRAMEWORK

1.7 Visual Structure
We applied a top-down methodology to establish a logical flow for users to explore the data from a high-level Singapore overview to deeper drill-down into the regions, areas and projects.

Fig. 6. Hierarchical view of the level of drill-down in our application.

1.8 Line Plot of SIBOR versus Units Sold
The plotly package was used to display the relationship between period average SIBOR and total number of units sold. As these two variables have different scales, we used a dual Y-axis, and to avoid confusion over which line refers to which variable, we applied a consistent colour scheme between each line and its axis. The hover elements are also in the same colours as their respective lines, making the visualization more informative and easier for users to gain insights. Time is plotted on the X-axis, and users can select a monthly, quarterly or yearly level of aggregation for the variables.

Fig. 7. Number of units sold versus Period Average SIBOR on a yearly basis.

At the monthly level, the number of units sold was summed per month and plotted against the 1-month SIBOR for that month. At the quarterly and yearly levels, units sold were summed per quarter or year, and SIBOR was taken as the average of the 1-month rate over the three or twelve months respectively.

1.9 Geofacet Country Overview
The geofacet map is created using the geofacet package in R, which extends ggplot2 in a way that makes it easy to create geographically faceted visualizations. To geofacet is to take data representing different geographic entities and apply a visualization method to the data for each entity, with the resulting set of visualizations being laid out in a grid that mimics the original geographic topology as closely as possible.

Fig. 7. Geofacet plot of median unit prices from 1995 to 2018, facetted by Planning Area

Fig. 8. Geofacet plot of median unit prices from 1995 to 2018, facetted by Postal District
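The grid-based layout described in Sections 1.6 and 1.9 can be illustrated without a plotting library. The sketch below is written in Python purely for illustration (the authors' implementation uses the geofacet package in R), and the area codes and grid positions are invented placeholders, not the grid the authors built in Geo Grid Designer. Each area is given a row and a column, and its panel is drawn in that cell so that neighbouring areas stay roughly adjacent.

```python
# Hypothetical grid: one record per area with a 1-based row and column.
GRID = [
    {"code": "N", "row": 1, "col": 2},
    {"code": "W", "row": 2, "col": 1},
    {"code": "C", "row": 2, "col": 2},
    {"code": "E", "row": 2, "col": 3},
    {"code": "S", "row": 3, "col": 2},
]


def check_grid(grid):
    """Each cell may hold at most one area, otherwise panels would overlap."""
    cells = [(g["row"], g["col"]) for g in grid]
    return len(cells) == len(set(cells))


def render(grid, panels, width=6):
    """Place each area's panel text at its grid cell; empty cells stay blank."""
    n_rows = max(g["row"] for g in grid)
    n_cols = max(g["col"] for g in grid)
    cell = {(g["row"], g["col"]): panels[g["code"]] for g in grid}
    lines = []
    for r in range(1, n_rows + 1):
        row_text = "".join(
            cell.get((r, c), "").ljust(width) for c in range(1, n_cols + 1)
        )
        lines.append(row_text.rstrip())
    return lines


# A "panel" here is only a label; in the application it is a small line chart.
panels = {g["code"]: "[" + g["code"] + "]" for g in GRID}
for line in render(GRID, panels):
    print(line)
```

The uniqueness check reflects the one real constraint of such a grid: two areas cannot share a cell, which is why a manually designed grid can only approximate the true geography.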

The advantages of applying the geofacet map are:
1. It is flexible, as it provides the capability to plot multiple variables within each geographic entity.
2. It preserves geographic context, as the panels are laid out in a grid that mimics the original geographic topology as closely as possible.
3. It builds on ggplot2, which makes geographically faceted visualizations easy to create.
We provided a user interface for users to switch between an adaptive Y-axis and a fixed Y-axis, after which the geofacet plot changes accordingly. This helps users to uncover different insights, which are elaborated in the Visualization Insights section.

1.10 Treemap to Reveal Market Structure
After getting an overview of price movements in Singapore using geofacet, we move on to look at the structure of the market using a treemap visualization, as it is good for visualizing hierarchical data. In this context, the hierarchy is defined as: Singapore -> Planning Region -> Planning Area -> Postal District.
First, the treemap package was used to generate a static treemap, which we stored as an object. Then, we passed this static treemap to the d3tree2 function to create an interactive treemap which users can click on to navigate between the different levels of the hierarchy.

Fig. 9. Overall Singapore Treemap. The block size represents the number of units sold in the area.

Fig. 10. Central Region Treemap. The block size represents the number of units sold in the area.

The sidebar user interface allows users to choose whether the size of the rectangles represents the total number of units sold in the area, or the total number of projects with transactions in the area.
However, the treemap by itself does not reveal a lot of information about the regions other than their relative median unit price through the block's colour, and the number of units through the block's size. Therefore, we decided to use the treemap as a user interface which is linked to a ridgeline visualization, explained in Section 1.11. For example, if a user clicks on the Central Region in the Singapore treemap, not only will the treemap zoom in to display Central Region information, but there will also be a ridgeline plot below which gives a detailed breakdown of transactions in the area.

1.11 Ridgeline Plots for Distribution of Prices over Time
We used the ggridges package to plot the ridgelines graph. The idea is to display the density of median unit prices across different variable levels on a consistent horizontal scale, while displaying the additional variable information on the Y-axis, the horizontal facets and the vertical facets.

Fig. 11. Ridgeline plot of median unit prices in Central Region.

The colour we use is "aquarium" with 0.5 opacity such that the overlapping areas can easily be detected. As ggridges is an extension of ggplot2, we used the facet_grid function to segment the data by property type and type of sale for better comparison.

The advantages of using ridgeline plots are:
1. It allows users to view distributions of continuous variables, while not losing view of other variables.
2. It allows users to detect sub-markets within regions or markets. In our case, substantial outliers form a second hump which characterizes a sub-market in that particular property type.
3. It informs users how median unit price has changed over time, as the peak of each hump may shift across the years.
4. It updates dynamically, as it is linked with the treemap for the selected location.

1.12 Local Indicators for Spatial Autocorrelation (LISA)
Typically, we compare property prices based on their absolute values, and we can only comment on whether the prices are "higher" or "lower" than one another. Beyond that, we do not get an objective view of how a property's price compares with those of other properties in its surroundings or in the whole nation. Therefore, we performed a LISA analysis on the median unit prices of projects and incorporated the results into our visualizations using interactive maps. The LISA analysis aims to reveal clusters of spatial features based on their attributes. In our project, we aim to identify clusters of projects based on their median unit price and identify the "hot spots" and "cold spots".
First, we used the coordinates function from the sp package to create a SpatialPointsDataFrame (SPDF) of projects based on their project names and x and y coordinates. Then, we removed projects that have duplicated locations before passing the projection system of the Singapore Planning Region shapefile to the new SPDF using the proj4string function.
With the new SpatialPointsDataFrame, we calculated a Local Moran's I statistic for each project using the localmoran function in the spdep package. The underlying equation for each project i is as follows:

where x_i refers to the median unit price of project i, X̅ is the average median unit price in Singapore, w_{i,j} is the spatial weight between project i and its neighbouring project j, and:
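Assuming the standard form of the Local Moran's I statistic, which matches the symbols defined above, the statistic and the variance term that this passage refers to can be written as follows. This is a reconstruction under that assumption rather than a formula quoted from the spdep documentation, and implementations differ slightly in how the variance term is normalised.

```latex
I_i = \frac{x_i - \bar{X}}{S_i^2} \sum_{j=1,\, j \neq i}^{n} w_{i,j} \, (x_j - \bar{X}),
\qquad
S_i^2 = \frac{\sum_{j=1,\, j \neq i}^{n} (x_j - \bar{X})^2}{n - 1}
```

Here n is the total number of projects. A positive I_i indicates that project i is surrounded by projects with similarly high or similarly low prices, while a negative I_i indicates that project i is an outlier among its neighbours.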
Based on the point pattern distribution of the raw dataset, we ran 1000 Monte Carlo simulations under the null hypothesis that there is no spatial pattern in the prices of private projects. After getting each project's Local Moran's I statistic, we classify the projects into five categories depending on their statistic and their median unit prices relative to Singapore's values:
1. Insignificant: Even though I have a Local Moran's I statistic, it is not statistically significant at the 0.1 significance level.
2. Low-low: If my price is lower than the national average and my neighbours' prices are as low or lower, I am a low-low point.
3. Low-high: If my price is lower than the national average but my neighbours' prices are relatively higher than mine, I am a low outlier among high points.
4. High-low: If my price is higher than the national average but my neighbours' prices are not as high as mine, I am a high outlier among low points.
5. High-high: If my price is higher than the national average and my neighbours' prices are as high as or higher than mine, I am a high-high point.
Then, the points are plotted on an interactive map using the tmap package. We chose an interactive map because it allows curious users to zoom in and explore the different projects in Singapore and compare their locations and prices, which is much more flexible than a static map that merely describes the differences from an overall view.

Fig. 11. Singapore map at highest zoom level

Fig. 12. Drill-down into Orchard area at lower zoom level

Human beings tend to associate red colours with hotter objects and blue colours with cooler objects. Therefore, we used a diverging colour scale to represent the four clusters from cold spots to hot spots.

Fig. 13. Interpretation of Local Moran's I Scatterplot

We also provided a sidebar interface for users to filter the property type and years they are interested in. Users can choose to compare the prices of any two years using the snapshot view, or they can choose the time-series view to compare prices over four consecutive years.
They can also tweak the values of the parameters for calculating the LISA statistic. If users choose the fixed bandwidth, projects will be compared with neighbours within a chosen radius. If users choose the adaptive bandwidth, projects will be compared with their k nearest neighbours, which may lie within circles of different radii for different projects.

Fig. 14. User interface of Hotspot map tab
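The two neighbour definitions and the high/low labelling used in Section 1.12 can be sketched in a few lines. This is a Python illustration of the standard Local Moran's I formulation, not the authors' spdep code: the coordinates and prices are invented, the equal (row-standardised) neighbour weights are an assumption, and the Monte Carlo significance test that produces the "Insignificant" category is left out.

```python
import math


def neighbours_fixed(points, i, radius):
    """Fixed bandwidth: every other point within `radius` of point i."""
    return [
        j for j in range(len(points))
        if j != i and math.dist(points[i], points[j]) <= radius
    ]


def neighbours_adaptive(points, i, k):
    """Adaptive bandwidth: the k nearest other points, whatever their distance."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return others[:k]


def local_moran(values, neighbour_lists):
    """Local Moran's I per point, with equal (row-standardised) neighbour weights."""
    n = len(values)
    mean = sum(values) / n
    stats = []
    for i, nbrs in enumerate(neighbour_lists):
        # Variance term computed over all points except i
        s2 = sum((values[j] - mean) ** 2 for j in range(n) if j != i) / (n - 1)
        # Weighted sum of neighbours' deviations; each neighbour weighs 1/len(nbrs)
        lag = sum(values[j] - mean for j in nbrs) / len(nbrs) if nbrs else 0.0
        stats.append((values[i] - mean) / s2 * lag)
    return stats


def quadrants(values, neighbour_lists):
    """Label each point by its own price and its neighbours' mean price,
    both compared with the overall mean."""
    mean = sum(values) / len(values)
    labels = []
    for i, nbrs in enumerate(neighbour_lists):
        if not nbrs:
            labels.append("no neighbours")
            continue
        nbr_mean = sum(values[j] for j in nbrs) / len(nbrs)
        own = "high" if values[i] > mean else "low"
        around = "high" if nbr_mean > mean else "low"
        labels.append(own + "-" + around)
    return labels


# Invented coordinates and median unit prices (illustration only)
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (10.5, 10.5)]
prices = [20.0, 22.0, 21.0, 8.0, 9.0, 7.0, 20.0]

knn = [neighbours_adaptive(points, i, k=2) for i in range(len(points))]
for label, stat in zip(quadrants(prices, knn), local_moran(prices, knn)):
    print(label, round(stat, 3))
```

With a fixed radius, projects in sparse areas may end up with no neighbours at all, whereas the k-nearest-neighbour rule always yields exactly k; this is the practical trade-off behind offering both options in the sidebar.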

Fig. 15. Toggle between snapshot view versus time-interval view

Fig. 16. Toggle between fixed and adaptive bandwidth

VISUALIZATION INSIGHTS
From the dual Y-axis plot, we can see an inverse relationship between SIBOR and total units sold. In general, when the SIBOR rate increases, the number of units sold decreases. There are some exceptions to the inverse relationship, where external factors such as cooling measures or economic crises affected the market, for example the Asia Pacific currency crisis (1998) and the United States subprime mortgage crisis (2008). In these periods the relationship does not follow the same pattern, and the exceptions are easily detected from the plot.
By looking at the geofacet plot, it is clear that the central region has a stronger increasing trend in prices than the rural areas, and its absolute prices are also higher. When we apply the adaptive Y-axis to the plot, we discover insights that are not discoverable in the fixed Y-axis plot. The trend of prices in the rural areas becomes much more obvious, and we can see that their prices are as volatile as those in the central region, just that their absolute values are lower.
From the treemap, we see that the central region has the most units sold between 2010 and 2017, and Orchard is the planning area in the region with the highest median unit price. When we zoom into Orchard and look at the ridgelines plot, we see the market is extremely niche, as it consists mostly of apartments and condominiums. Resale apartments in Orchard have an extremely wide spread of median unit prices compared to condominiums and to most price distributions seen in other parts of Singapore.

FUTURE WORK
For our application we have concentrated only on the private property market and narrowed down the number of variables for our visualization. In future work, we might include HDB properties to give a clearer picture of the factors that affect private and public properties. Our application currently represents transactions by their median price; however, including the individual transaction records within a specific time frame would be more informative for seeing the changes in the price of specific properties. To provide a more comprehensive analysis, it would also be preferable to include variables such as completion status, that is, whether the project is completed or still in the planning stage.

LEARNING EXPERIENCES
Orkhan: While plotting the dual Y-axis chart, we faced a challenge in bringing two variables to the same scale. First, we tried to use ggplot features to map SIBOR and units sold to two Y-axes; however, ggplot requires one of the variables to be manually rescaled to the scale of the other. With the plotly package there is no need to rescale the two variables to the same base, so we managed to overcome that issue.
Joanne: As R and R packages are all open-source, there were times when we got "surprises" while using them in our project. For example, in the hotspot map section, I wanted to use the tm_facets function in the tmap package to show the changes in clusters over time using multiple facets, and make use of the "sync" argument to allow users to have coordinated navigation across all four plots. Eventually we discovered that the leaflet output can be shown in RStudio but not in the R Shiny application. We were disappointed because it was a function we loved, but we eventually overcame it by plotting four separate maps. Similar to Orkhan's experience, there are times when we have to explore multiple packages and ways of doing things before we settle on one that produces the desired output. Development work is always an iterative process in which we go through multiple changes before arriving at the final product. The team must be patient and strong enough to navigate the lows together so that we can experience the satisfaction of completing the milestones at the end.
Fu Yi: When plotting the ridgeline plot, I planned to put a continuous variable on the Y-axis; however, the ridgeline plot only accepts a discrete variable on the Y-axis and a continuous variable on the X-axis.
As for the project, it did not go smoothly at the early stage. Before settling on the Property Market Watch topic, we discussed visualizing logistics and supply chain data from an e-commerce (QVC) dataset. After looking closely at the QVC dataset for a month, we had not managed to start our project, because there were data quality issues, the data structure was not clear enough to visualize from any angle, and we could not find a meaningful objective.
After a university lesson, we were inspired by the hierarchical REALIS dataset. We immediately consulted Prof Kam about our new thoughts and brainstormed some enlightening ideas with him. From that day, our project took off like a rocket: we tried out different visualization methods, some of which were rejected and some of which brought significant improvement. My key takeaway is that we should never just sit and imagine; we should always try things out to see whether they are applicable.

INSTALLATION AND USER GUIDE
The application can be accessed at https://joannetyy.shinyapps.io/SGPrivateProperty/. To run the application on local machines, users are required to install a list of R packages inside their R/RStudio environment. The list of packages is provided in Annex A.

ACKNOWLEDGMENTS
We sincerely thank Prof Kam Tin Seong for his consistent support and guidance in teaching us different visualization tools throughout the ISSS608 course, and for his valuable advice given during the many iterations which ultimately produced the final application.

REFERENCES
[1] Carson Sievert (n.d.). Plotly for R. Retrieved from https://plotly-book.cpsievert.me/index.html
[2] Claus O. Wilke (04 April 2018). Introduction to ggridges. Retrieved from https://cran.r-project.org/web/packages/ggridges/vignettes/introduction.html
[3] Ryan Hafen (2018). Geofacet. Retrieved from https://hafen.github.io/geofacet/
[4] Government Policies and Private Housing Prices in Singapore. Published in Urban Studies, Volume 34, Issue 11, November 1997, Pages 1819-1829. http://journals.sagepub.com/doi/10.1080/0042098975268
[5] Kam Tin Seong. Hands-on Exercise 10: Geographically Referenced Attribute Analysis [PDF file]. Obtained during consultations with Prof. Kam.
[6] Kam Tin Seong. Lesson 9: Treemap Visualization with R [html file]. Obtained during ISSS608 course taught by Prof. Kam.
[7] Monetary Authority of Singapore (n.d.). Domestic Interest Rates. Retrieved from https://secure.mas.gov.sg/dir/domesticinterestrates.aspx
[8] Singapore Ministry of Finance (n.d.). Stamp Duties. Retrieved from https://www.mof.gov.sg/Policies/Tax-Policies/Stamp-Duties
[9] Timothy Kua (2012, October 30). 1 Month Vs. 3 Month SIBOR. Which is better? [Blog post]. Retrieved from https://blog.moneysmart.sg/home-loans/1-month-vs-3-month-sibor-which-is-better/
[10] Urban Redevelopment Authority (2018, July 2). URA releases flash estimate of 2nd quarter 2018 Private Residential Property Price Index. Retrieved from https://www.ura.gov.sg/Corporate/Media-Room/Media-Releases/pr18-40

ANNEX A

ANNEX B