September 18, 2014
Fecal Map NYC: The Worst Places to Swim in the City

If you have ever tried to visit a NYC beach shortly after a heavy rain, you may have been disappointed to find that beach closed.

The reason is one of every NYC environmentalist’s worst nightmares: Combined Sewer Overflows (CSOs). Put simply, New York City’s sewage goes to the same place as its street drainage. That works fine until we get so much rain that the sewage treatment plants can’t handle both the stormwater and the sewage flowing through our sewers. As a result, this combination of stormwater and sewage overflows, and the resulting backup is released into our very own New York City waterways. For the curious, check out this great page by the DEP, which includes descriptions of CSOs and maps of the outflows.

So back to the beach: what causes it to close exactly? Well, the city monitors its waterways for Fecal Coliform, something that is as gross as it sounds. Specifically, it’s a type of bacteria that grows in the intestines of warm-blooded animals. High levels of fecal coliform indicate a high probability of raw sewage in the water. If levels go above 1,000 coliform per 100ml of water, beaches are closed in accordance with state regulations.

To find the dirtiest water in New York City (or at least the most sewage-full water, since there are many different ways to measure water quality), I turned to Harbor Water Sampling Data released as Open Data by the DEP. The dataset includes samples from dozens of sites back to 2008.

I explored the mean, minimum, median and max levels of fecal coliform at each site, but to decide which area was the dirtiest, I calculated the percent of days sampled at the site that registered as too dirty to swim in (i.e. above the safe level of 1000 coliform / 100ml).  
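I did this step with Excel pivot tables, but for anyone who prefers code, here is a minimal Python sketch of the same "percent of samples that were unsafe" measure. The site names and readings are made up for illustration (they are not DEP data), the function name is my own, and the sketch assumes one reading per site per sampling day.

```python
from collections import defaultdict

# Closure level cited above: 1,000 fecal coliform per 100 ml of water.
UNSAFE_THRESHOLD = 1000


def percent_unsafe_by_site(samples):
    """Return {site: percent of samples above the unsafe threshold}."""
    totals = defaultdict(int)
    unsafe = defaultdict(int)
    for site, coliform in samples:
        totals[site] += 1
        if coliform > UNSAFE_THRESHOLD:
            unsafe[site] += 1
    return {site: 100.0 * unsafe[site] / totals[site] for site in totals}


# Hypothetical readings, one (site, coliform per 100 ml) pair per sampling day.
samples = [
    ("Site A", 2400), ("Site A", 300), ("Site A", 1500), ("Site A", 5000),
    ("Site B", 40), ("Site B", 1200), ("Site B", 15), ("Site B", 90),
]

# Rank sites from dirtiest to cleanest by this measure.
ranked = sorted(
    percent_unsafe_by_site(samples).items(),
    key=lambda item: item[1],
    reverse=True,
)
for site, pct in ranked:
    print(f"{site}: {pct:.0f}% of samples too dirty to swim in")
```

Ranking by the share of unsafe days rather than by the mean keeps one extreme reading from dominating a site's score.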

The Top 10 dirtiest water sample locations by that measure are below:


The dirtiest water?  Coney Island Creek, which sits between Coney Island and the rest of Brooklyn.   Not far behind it is Bergen Basin, near JFK. These two are at the top of the list by the mean measurement as well.  The Bronx River is number 3, Alley Creek is 4 and Bergen Basin comes back for number 5. At all five of these spots, samples came in as having too much fecal coliform to swim in more than half the time! So I mapped out these five “fecal hot spots” below: 


Spots 6 - 10 go to two sites in the Gowanus Canal, Flushing Creek and another site in both The Bronx River and Coney Island Creek.  

To expand beyond the top 10 spots, I created the interactive map below, which includes all of the harbor locations that were measured in the DEP data.  Just like the analysis above, I mapped the percentage of time that water levels were unsafe for swimming.  Larger circles indicate a higher percentage of unsafe days, and thus dirtier water.  Clicking on a circle gives you fuller details for that site. 

Note that the larger circles appear more inland. The conclusion? If you are going to swim in NYC, I guess the rule of thumb is to stay away from anything with the word “creek” in its name (and of course “canal”) and head toward the rivers. The one exception seems to be the Bronx River. I suppose it’s sort of intuitive: interior waterways have much less water to dilute waste matter, and they generally move slower than their large river counterparts. (Of course this is more of a theoretical swim. If you are ACTUALLY going to swim, hit up the beaches!) The best part of all of this? I may have just discovered the origin of the old saying “Up sh*t creek without a paddle”.

-Analysis done in Excel (pivot tables)
-Map formed in QGIS and then exported to CartoDB
-All Data used can be found here.

Mailing List
Filed under: DEP nyc opendata 
September 13, 2014
Mapping the 25 Best Food Trucks in NYC - The 2014 Vendy Awards

Today’s post is a bit different from prior ones in that there is no analysis. But I don’t need to run the numbers to know everyone likes good street food.  This post is about where to find it. 

I had the pleasure of emceeing the 10th Annual Vendy Awards today on Governors Island. It is an all-you-can-eat-and-drink competition between some of the best food carts/trucks in New York City, but it’s also a fundraiser for the Street Vendor Project, an advocacy group for the thousands of street vendors in our city.

I’m going to be doing some posts on vendor data in the future, but in the meantime I figured that since not everyone could join today, I would make a map of all 25 nominees for the awards.  

I looked at the neighborhoods that the food trucks or carts of each Vendy finalist frequent, and mapped one of their main locations. Some food trucks move from one neighborhood to the next without any pre-determined schedule. In those cases, if they had an associated brick and mortar store, I mapped that location instead. For more details on any individual truck, go to the Vendy page, which has links to each one’s information.

I’ve color coded the locations for each of the 5 Vendy Award categories. Enjoy, and happy eating!

September 8, 2014
Update: MTA Cites Lack of “Infinite Change” in Current MetroCard Default Pricing. Forgets Credit Cards Exist.

In my six months writing on I Quant NY, three government agencies have responded to my posts.  

The first was the NYC Department of Health (DOH), which responded to a post about possible grade inflation in restaurant inspection scores. Unfortunately, that response said almost nothing and we never heard from them again.

The second was the NYC Department of Transportation (DOT), which responded to my exploration of a poorly designed street leading to tens of thousands of dollars in fire hydrant tickets each year. Unlike the DOH, the DOT came through, commenting that they would “review the roadway markings and make any appropriate alterations.” And to their credit, within a few months, they fixed what was a deficiency in street markings. That was incredible!

Today marks the third response, this time from the MTA. I was hoping they would be more of a DOT than a DOH.  Andrew Ramos of WPIX reached out to them for a comment about my last post pointing out that the current vending machines are designed to add unusable balance to MetroCards. Their response: 

“These machines do not hold an infinite amount of change and the denominations are suggested to insure there is ample change to accommodate customers who pay with cash,” a spokesperson said in a statement.

“That being said, we will certainly look at this as part of the process involved in rolling out the next scheduled fare increase slated for next year”

Let’s break this response down for its absurdity.

I won’t even get into the “infinite” part.  But there are many machines in the MTA system with no change at all.  They are Credit Card Only machines and they are all over the system.   So the MTA could make a small software change that only applies to the Credit Card Only machines with limited effort.  Moreover, a slightly more complicated software change could lead to a more widespread solution.   By asking customers upfront if they were paying by credit card, they could provide better default options tailored to the payment type, and avoid the need for “infinite change”.

So how does the MTA’s response really address this problem? By telling us they will look at it next year… with a scheduled fare increase.

My last post had about a quarter million hits in the last few days, and over 75,000 Facebook likes and shares.  I’d say that shows an overwhelming interest in the area.  But a small percentage of responses pointed out that the design may not be intentional.  I wanted to believe that, since it would mean that the MTA might take responsibility for the design flaw, apologize, and promise to fix it quickly.  But instead, we got a statement simply not addressing the points in the article. MTA, if you are out there, please prove me wrong.  Fix the vending machines. 

Filed under: responses mta update 
September 5, 2014
How Memorizing “$19.05” Can Help You Outsmart the MTA

We’ve all been there.  The train is coming into the station, and you grab your MetroCard and quickly try and swipe it at a turnstile.  

"Please Swipe Again".  "Please Swipe Again".  "Insufficient Fare".

The last two words are killer.  You think to yourself “I swear I had a balance on this card”.   You go and check the card out and you see you have “$2.45”.  Yes, you need $2.50 to ride the subway, and you have $2.45 on your MetroCard.  Sure enough you miss that train all because of that nickel. 

How did you end up in that situation anyway? It turns out the MTA has designed it that way. Imagine how many tourists come to NYC and leave with balances that never get used. Imagine how many people lose MetroCards with those balances that never get used. And even if it gets used on a later refill, the MTA gets to collect the cash earlier this way! Win win for them, right?

But now, with some simple math, you can fight back!  

First, let’s see how the MTA tricks you out of your money earlier than you might want to release it to them.

When you are buying a MetroCard, you can get a 5% bonus if your purchase is big enough.  So you get the following screen early on in the purchase process: 


If you click the button on the left, they just got you.  Your card will have $9.45 on it, meaning you will get 3 rides and end up with $1.95.  That is a great deal for the MTA.  They get all the money from every rider who does that, and they get the interest on that until you refill again and repeat the cycle.  

Let’s say you don’t take the bait.  You click MetroCard.  Then you get this screen with three new short cuts:


Three quick options.  But wait a minute.  One button leaves you with the same $9.45 card, and gives a remainder of $1.95 after just three uses. The next one is even more frustrating: you end up with a $19.95 card, leaving a remainder after 7 uses of $2.45!   That’s right, the nickel we were talking about earlier.  The last option does not leave you much better off.   You’ll get a $40.95 card, which leads to $0.95 on your card after you use 16 rides.  So all three buttons presented leave quite a bit of “insufficient fare” on the card. 
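If you want to check that arithmetic yourself, here is a small Python sketch. The payment amounts of $9, $19 and $39 are inferred from the card values quoted above ($9.45, $19.95 and $40.95 after the 5% bonus), and rounding the bonus half up to the nearest cent is my assumption about how the machines behave. The function names are made up for this example.

```python
FARE_CENTS = 250     # $2.50 per ride, as stated above
BONUS_PERCENT = 5    # the 5% bonus on qualifying purchases


def card_value_cents(payment_cents):
    """Payment plus the 5% bonus, rounded half up to the cent (assumed)."""
    return (payment_cents * (100 + BONUS_PERCENT) + 50) // 100


def rides_and_leftover(payment_cents):
    """Return (number of full rides, stranded balance in cents)."""
    value = card_value_cents(payment_cents)
    return value // FARE_CENTS, value % FARE_CENTS


# Payments inferred for the three shortcut buttons, in cents.
for payment in (900, 1900, 3900):
    rides, leftover = rides_and_leftover(payment)
    print(
        f"Pay ${payment / 100:.2f}: card value ${card_value_cents(payment) / 100:.2f}, "
        f"{rides} rides, ${leftover / 100:.2f} stranded"
    )
```

Working in whole cents avoids floating point surprises when taking 5% of a dollar amount.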

So how do you fight back? Well, click “Other Amounts” and type your own values:


and remember these three magic numbers:   $9.55, $19.05 and $38.10. That’s right. Never use the short cuts.  Just type in one of those numbers.  

Once you do, you’ll see your excess balances nearly vanish once you apply the 5% bonuses: 


Buy a $19.00 card? $2.45 left on card after use. Buy a $19.05 card? No balance left after use! Magic. But what if you want a $10.00 MetroCard? There is literally no way to buy one because of the 5% bonus and the fact that all payments need to be a multiple of a nickel. Your options are to pay $9.50 to get a $9.98 card after bonus, or pay $9.55 to get a $10.03 card after bonus. Once again, you literally can’t buy a $10 MetroCard from a machine.

If you absolutely don’t want any left over money, you really only have three choices of payments below $40, as seen in the table below: 


If the pennies bother you, then maybe memorize these three numbers: $11.90, $19.05, $30.95.
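For the skeptics, the search for those zero-balance payments can be sketched in a few lines of Python. This is a sketch under stated assumptions, not the MTA's actual logic: it applies the 5% bonus to every payment (ignoring the minimum purchase needed to qualify, which I haven't spelled out here), rounds the bonus half up to the cent, and only considers payments that are multiples of a nickel. The function names are made up for this example.

```python
FARE_CENTS = 250     # $2.50 per ride
BONUS_PERCENT = 5    # the 5% bonus
NICKEL_CENTS = 5     # machines only accept multiples of a nickel


def card_value_cents(payment_cents):
    """Payment plus the 5% bonus, rounded half up to the cent (assumed)."""
    return (payment_cents * (100 + BONUS_PERCENT) + 50) // 100


def leftover_cents(payment_cents):
    """Balance stranded on the card after taking every full ride."""
    return card_value_cents(payment_cents) % FARE_CENTS


def zero_leftover_payments(max_payment_cents):
    """All nickel-multiple payments below the cap that strand nothing."""
    return [
        payment
        for payment in range(NICKEL_CENTS, max_payment_cents, NICKEL_CENTS)
        if leftover_cents(payment) == 0
    ]


# Payments under $40 that leave no balance at all.
for payment in zero_leftover_payments(4000):
    print(f"${payment / 100:.2f}")

# The "magic numbers" from earlier, which leave only pennies (or nothing).
for payment in (955, 1905, 3810):
    print(f"Pay ${payment / 100:.2f}: {leftover_cents(payment)} cents stranded")
```

Under these assumptions the search turns up the same three payments listed above, and $38.10 misses by a single penny because of rounding.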

So if the MTA really cares, what can they do to fix this?

Well here at I Quant NY, I’ve been hard at work coming up with a proposed software change.  After much thought, check out this before and after: 





Not a big change you say? Echm. That’s right. If they really wanted to fix the issue, they could ask “How much do you want on your MetroCard” instead of “How much do you want to pay”. But don’t count on those changes coming to a MetroCard Vending Machine near you anytime soon, given how lucrative the current setup is.

Which means it’s up to you. Write down the three numbers, $9.55, $19.05 and $38.10, or pick just the one that matches your buying habits best. You could even write it on the back of your MetroCard if you can figure out how to get ink to stay on it. (There’s a reason they are so shiny.)

A side note: one reason that the MTA may do this is to make paying with cash easier. It would be a nightmare to dispense change if cash buyers used this technique. But that does not explain why they can’t update the credit card only machines, or all other machines if they first ask whether you are using cash or credit. And of course unlimited card buyers avoid this altogether. Also, this does not include the $1 fee associated with new MetroCards.

So in closing, Math is useful.  And luckily, you don’t have to be Einstein to outsmart the MTA.  Plus, guess what year Einstein handed in his dissertation…  You guessed it.  1905.  

For the latest I Quant NY data analysis of this great city, sign up for my Mailing List (about one post a week), Follow me on Facebook or Follow me on Twitter.  I tell stories with data.

Past posts include finding and fixing the most profitable fire hydrant in NYC, showing that the Health Department is inflating grades or looking at gender and Citibike.

Ben Wellington is a Visiting Assistant Professor in The City & Regional Planning Program at Pratt Institute in Brooklyn, where he focuses  on NYC Open Data.  He holds a Ph.D. in Computer Science from NYU.

Update (9/8/2014):

Some have pointed out that this may not be intentional. Yes, it could be the case that even though the MTA takes in about $50 million a year in unredeemed excess balances, no one ever noticed this. I don’t have any idea where these decisions come from, so I am not in a place to point fingers with any proof. But, intentional or not, the buttons are tricking people out of their money. If it really is unintentional, I’d be thrilled because it won’t be long until the problem is fixed given all the attention this post has gotten. So let’s hope all the people who say the MTA did not do this intentionally are correct. I can think of no better way to be proven wrong.

Update2 (9/8/2014):

MTA Responds!  Read more here

Filed under: mta nyc subway transportation 
August 27, 2014
Canarsie Tops List of Most Flooded Neighborhoods According to ClaimStat Data

In a quest to do for claims against the city what CompStat did for crime, Scott Stringer recently released ClaimStat. It maps out claims against the city, including the date of occurrence, type of claim, location and amount paid out in the claim.

It’s a great step for transparency, but once again the city failed to release the data in a machine-readable format. But fear not: the amazing Chris Whong (of Taxi Visualization fame and an organizer of BetaNYC, NYC’s network of civic-minded hackers who are opening government data) posted a quick how-to on extracting data files from the maps. So with that, I was off and running.

Today’s post is on Sewer Overflow Claims. What are Sewer Overflow Claims you ask?  Well, the city’s network of sewers can only handle so much water at once.  During large storm events, they can overflow and back up onto city streets or they can occasionally get clogged for other reasons.  When the sewers do overflow, the water can cause damage to property and thus property owners can file a claim against the city for that damage. The good news in all of this is that the claims data can help us identify where flooding from sewer overflow is happening the most.  If we can identify the worst offenders, the DEP can better target infrastructure projects. 

First some quick stats.   In the two years present in the data, there were 1,168 claims filed.  The bulk of the claims were in Brooklyn and Staten Island:


The average payout for those claims which have been paid is around $4,000.

Scott Stringer’s ClaimStat report explores flooding by Community District. I decided to take a more detailed look at the underlying neighborhood data given that Community Districts span many neighborhoods.   So I split the data into Neighborhood Tabulation Areas (NTAs), which are neighborhood designations used by the Department of City Planning for population projections.  I then counted the number of claims filed in each area over the two year period to get a view of which neighborhoods experience the most flooding:


The results show that Canarsie fares worst, with 16% of all claims. But second behind it is the Bergen Beach NTA, which is adjacent to Canarsie in Brooklyn, and third is the Sheepshead Bay NTA, which is just south of that. These three contiguous NTAs make up almost a third of all Sewer Overflow Claims in the city. The top 10 neighborhood areas on the list made up half of all claims, making these prime areas to reinvest in new infrastructure.
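The per-neighborhood tally described above was done with QGIS and Excel, but the counting step can be sketched in Python. This sketch assumes each claim has already been tagged with its NTA (the spatial matching itself is not shown), and the area names and claim counts are placeholders, not the real ClaimStat figures.

```python
from collections import Counter


def claims_share_by_area(claim_areas):
    """Return [(area, claim count, percent of all claims)], largest first."""
    counts = Counter(claim_areas)
    total = sum(counts.values())
    return [
        (area, count, 100.0 * count / total)
        for area, count in counts.most_common()
    ]


# Hypothetical input: one NTA label per claim filed.
claim_areas = ["Area A"] * 4 + ["Area B"] * 3 + ["Area C"] * 1

for area, count, share in claims_share_by_area(claim_areas):
    print(f"{area}: {count} claims ({share:.1f}% of all claims)")
```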

To get a different view,  I created a heat map, or should I say a wet-map, of all the claims in the city:


The map clearly identifies hot- errr I mean wet-spots to focus on as far as mitigating these issues.

I was also curious to see how consistently the flooding hits the same neighborhoods.  So I chose the three largest storms in the data: Hurricane Irene, Hurricane Sandy, and a record breaking rainstorm that hit the city a few weeks before Hurricane Irene.  

I mapped out the claims for each storm below, as well as all the remaining claims that were not associated with any of the three storms.


Interestingly, although there is some overlap, there are also distinct areas affected by each storm. To look at it another way, we can break those same three storms down in a table to see their individual effects on the neighborhoods with the most claims: 


Only Canarsie and the Bergen Beach NTA were hit severely by all three storms. The other top flooding neighborhoods seem to be storm-dependent.    

One more way to look at the severity of flooding is to explore the number of unique days on which claims occurred, e.g. if 10 claims were filed in one day it would still only count as one in this metric.


Once again, Canarsie tops the list.  Sewer overflows that resulted in claims happen about every 5 weeks on average there. But the next five NTAs are in Staten Island according to this measure. 
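The unique-days measure is easy to get wrong in a spreadsheet, so here is a minimal Python sketch of the idea. The claims below are invented for illustration, and the sketch assumes each claim comes with an NTA label and a date of occurrence.

```python
from collections import defaultdict
from datetime import date


def unique_claim_days(claims):
    """Return {area: number of distinct dates with at least one claim}."""
    days_by_area = defaultdict(set)
    for area, day in claims:
        days_by_area[area].add(day)  # a set ignores repeat claims on one day
    return {area: len(days) for area, days in days_by_area.items()}


# Hypothetical claims as (area, date of occurrence) pairs.
claims = (
    [("Area A", date(2011, 8, 28))] * 10          # ten claims, one storm day
    + [
        ("Area B", date(2011, 8, 14)),
        ("Area B", date(2011, 8, 28)),
        ("Area B", date(2012, 10, 29)),           # three claims, three days
    ]
)

for area, days in unique_claim_days(claims).items():
    print(f"{area}: claims on {days} distinct day(s)")
```

In this made-up example, Area A has more claims but Area B floods on more separate occasions, which is exactly the distinction this metric is meant to surface.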

So the conclusions? Well, Canarsie, Canarsie, Canarsie for one. It appeared on the top of the list no matter how I sliced it. It’s pretty clear that the sewer infrastructure needs upgrading there, and I hope the DEP has it on the top of its list as it is by far the area with the most flooding claims. Other than that, several neighborhoods seem to vie for second place depending on what measure you use. Neighborhoods in Southeast Brooklyn, as well as many along the water in Staten Island, seem like good contenders.

It’s important to note that while increased claims indicate increased flooding, it’s not necessarily true that increased flooding will create increased claims in all affected neighborhoods. One could imagine that there might be information gaps around this issue, leading some neighborhoods to be undercounted in these numbers. Also, there is flooding not attributed to sewer overflow. So as always, this data should not be looked at in isolation.

And finally a closing thought: reducing the number of claims against the city can only be a good thing. No one wants our fine city to end up under water. 

Neighborhood Tabulation Area Data here.
Claims data here with extraction instructions here
Maps made in QGIS with Open Layers Plugin for Google Maps, Raster Plugin for Heat Map.
Charts made in Excel.
