Data Visualization on the Web with DataSketches
Modern Web Podcast Transcript

Tracy Lee and I interview Shirley Wu and Nadieh Bremer, long-time members of the D3.js and data visualization communities. We discuss Shirley and Nadieh’s latest collaboration, DataSketches, in which they each produce a data visualization monthly, with a common theme. We also discuss broader trends of building data visualization using web technologies.

The following is a transcript of the interview from the episode.


Tracy

Hello. You are listening to the Modern Web podcast, where we cover next-generation web standards and techniques. I’m personally on Twitter and you can find us online at modern-web.org.

Ray

Hi everyone. This is Ray Shan. Welcome to the Modern Web podcast. Tracy and I are joined by Shirley Wu and Nadieh Bremer to discuss data visualisations on the web and their latest collaboration, Data Sketches. Tracy would you like to say hi?

Tracy

Hi everybody. So glad to have everybody on. Really excited to see or hear about the cool projects and the new things happening in D3.

Ray

Shirley would you like to introduce yourself?

Shirley

Yes. Thank you so much for having us on Modern Web. This is actually Nadieh’s and my first podcast together for Data Sketches, so that’s really cool. But I guess a little bit about me: I do data visualization for the web, and I guess that’s the most succinct thing to say.

Ray

You do a lot more than just data visualization on the web. You also organize one of the largest D3 user groups in the world, and you also put on a really kick-ass D3 conference.

Shirley

Thank you. It’s definitely not just me who does those things. Yes, I do data visualization for the web, and I’ve been part of the Bay Area D3 User Group for the last few years. In the past year I have become one of the coordinators for that user group as well as the annual D3 unconference, which is really, really cool.

Ray

Yeah, I’ve been going to the user group meetup for a long time and I’m a huge fan.

Shirley

Thank you.

Ray

Nadieh would you like to introduce yourself please?

Nadieh

Sure, yes. And it’s really cool to be here. So I am officially an astronomer, and I live in the Netherlands, in the tiny town of Amsterdam. But after astronomy I ended up doing data science. There you always have to present your results to the clients, so I started doing data visualizations to represent the ideas and strategies that we found. I found myself loving the visualization part of it more and more and trying to learn more about it, and then I came across D3 at a data science conference. I fell in love instantly, so the week after I came back, every free moment I had I spent on trying to make my first plot. Two years later, I have gone all the way to data visualization design, working as a designer at Adyen, a payment service provider in Amsterdam. I’m also thinking about going freelance at some point, so hopefully in the near future freelance as well.

Ray

That’s quite amazing. That’s quite an unconventional path into data visualization, coming from an astronomy background. What is that like? Do you think your formal training has helped you in this field?

Nadieh

Yes and no. Yes, because it is data visualization, and in astronomy you learn how to handle data and what to do with it, and you pick up some machine learning techniques there. But on the other hand, no: I have had no web design or art design courses, or really good computer science courses, so that’s where I have had to learn stuff for myself. So it didn’t hurt at all, but it could have been better, I guess.

Ray

One fun fact: I was listening to this podcast called Linear Digressions, so shout out to Linear Digressions. It is a podcast about big data, data science, and machine learning. In one of the episodes they talked about where the biggest data in the world is, and one of the things they mention is astronomy, because there’s like an unlimited amount of data that you can possibly capture in astronomy.

Nadieh

The universe is pretty infinite, and there are so many stars and galaxies all sending their light to Earth. But also with simulations they have quite large data sets; trying to simulate the universe is a tricky business.

Ray

I bet it’s even bigger than, like, the Human Genome Project, which is quite amazing. Great. Would one of you like to give an introduction to Data Sketches: what it is, how it came about, and how you came to collaborate on it?

Shirley

Sure, I’ll try and take a stab at it. Data Sketches is a monthly collaboration between Nadieh and me, and each month we choose a topic. The very first one was back in July, with movies. Within that topic we’re free to do whatever we want, so long as we finish the visualization. Then, after we finish the visualization, we do a write-up of the whole process and the challenges we faced: wrangling the data, sketching out ideas, and then finally the code implementation. I actually don’t remember why we decided to do the write-up. Nadieh, do you remember? I think we were just like, oh, it would be cool if we just wrote down the process. But it’s actually turned out, I think, to be one of the most valuable things we’ve done with it. We’ve gotten a lot of feedback that it’s both so cool and so educational to see not only the end product, which I think for those starting out in visualization tends to be quite intimidating, but also to be able to see the process, which is oftentimes just as important, if not more important.

As for how we came to it, I think back in June Nadieh and I were talking. I think we were talking because I had just done some of Nadieh’s tutorials, and then we were talking about doing side projects and how long it had been since we had both completed side projects, just because life and work had been so busy. I remember I was like, oh, you know, if there was anybody I would love to collaborate with, it would be Nadieh, who’s so prolific and even more of a workaholic than I am. We started talking about what the shape of this would be, and Nadieh was just as enthusiastic, if not more, about the whole collaboration. There are a lot of times when you mention an idea to somebody and they’re like, oh, that’s cool. And then you start working together and they’ll be like, oh, life got busy, I’m sorry, I can’t do that today, or, I’m sorry, this weekend I just don’t have time for it.

But with Nadieh, every day we would have that exchange. And it’s really funny, because we only have between, what, 10:00 a.m. Pacific Time and like 2:00 p.m. Pacific Time in which we’re both awake. So those are the only times we could talk, and otherwise we just leave each other a wall of text and then we answer each other when we wake up. We exchange all of these different ideas, and then one day I woke up and there was just this wall of text from Nadieh saying, hey, I was really bored at work and had some time, so I just built out our website. And I was like, wow, cool, thank you so much. It’s been an awesome, wonderful collaboration, and it’s also turned out to be a really great friendship. I feel like I’ve talked to her more than I talk to some of my friends back home.

Nadieh

It’s been even more amazing than I would have expected. Yes, in June, I still remember you saying, well, why don’t we collaborate? I don’t know if you were actually thinking, OK, so how can she politely say no, but I was like, no, I want to work together with her. The things that she created with Tweetie face, these are all things that I have no idea how to do. So I really wanted to learn from Shirley. I was like, oh, I’m so glad she said that; I would have never dared to ask, just because I was afraid that she would say no. I was so happy with that, and with the way that we also give each other feedback during the process. That brings the visualizations to another level. It’s so cool. I’m so happy to be doing it.

Shirley

Yes, definitely. Thank you so much for agreeing, because that was like a shot in the dark. I was like, Nadieh just makes such really amazing things, and it would be so cool if I could just work with her. And I was like, hey, and you were like, yeah, and I was like, oh my god.

Nadieh

Basically that was it for me as well, but then on the receiving side, like, oh.

Shirley

Yeah. And it’s been amazing since.

Ray

So one of the things that really caught my eye was the fact that every single data visualization for every month is very different, right? I think when people think of data visualization, they tend to think of charts. They think Excel. They think Google Finance or something like that. But I see a lot of artistic elements in your work. Can you talk about your inspiration, and where and how you come up with the data visualizations?

Nadieh

Sure, yes. So one of the things we said from the start is that the second week is meant for sketching, and we really do that with pen and paper. Just draw anything, whether you like it or not. You don’t have to explain how you would program a rectangle or a circle; you just draw. That also sort of gives you greater creativity, because it’s so easy to just draw whatever you think of, wherever you like.

We really start out from the data. We investigate what data is available, and then once you have the data and you get a feeling for it, the data sort of steers you, I guess. With the insights that you find, well, how can you then visualize that? Somehow the data helps you with that; it structures that for you. There are only so many ways that you can show how many medals have been won, or what the numbers are for the harder diving sequences in the Olympics.

But then, and maybe I’m speaking for Shirley as well, for me at least, I keep a Pinterest board that I try to go through every now and then, and it inspires me. Throughout the months, when I see things that interest me, I try to keep them in mind and see if I can use them later on during the sketching phase. For example, with the feathers: I saw a peacock feather and thought to myself, that’s an interesting shape, going from small to wide. That was right before I started on the Olympics project. At first I thought, well, it would be interesting, but then I really try to see if it works with the data. Not every inspiration will always work, but in this case it did. So I get inspiration from anywhere, not just from looking at graphic design.

Ray

I think that is a great idea. One of the things I try to do is also to keep some sort of inspiration board. I think for a lot of us who work in UI implementation, that is really key, because it’s not just about the technology behind it; it’s about what inspires you, the things that you see in daily life. If you keep them around, they might come in handy one of these days.

Shirley

Yeah, most definitely. Just to add a little bit: for a lot of the things that I’ve been doing on Data Sketches, I actually got really inspired by some of the things Nadieh did even before the collaboration. Nadieh has this beautiful piece called, I think, Art in Pi, which I think she in turn was inspired by somebody else to do. I saw that and I was like, oh wow, that’s gorgeous, and I remember I recreated it in canvas. From that moment on I had this concept of not quite data visualization but rather data art, which is basically using data to generate art. There’s really a distinct difference: data visualization is displaying data in such a way that a person can gain insight from the visual patterns, whereas data art is a lot more about using data to generate beautiful artistic pieces that don’t necessarily have to be informative. I didn’t even realize that was a possibility until I saw Nadieh’s Art in Pi piece.

Then when I started doing Data Sketches, I was like, oh, up until that point my side projects had been the more visualization-leaning things; I need to take time to try out these more data-art kind of things. That is probably why my work leans more towards the non-traditional visualizations that you mentioned, Ray. But I definitely agree with both Nadieh and Ray about the inspiration on Pinterest, and also the inspiration from daily life, just seeing something, and going to art museums. I try to take a lot of pictures on all my travels of anything that is a different way of looking at a piece of information. I try to take note, take a picture, and then maybe write it down in my sketchbook and have that as the inspiration.

Ray

Yeah. Can you guys talk a little bit about the technology you use? Is it mainly D3, or do you also blend in some other technology?

Nadieh

So for me it’s mostly D3, but for data preparation I always turn to R. That comes from my data science background; that’s what I use to process data. After having prepared the data, I also try to prepare it for the visualization itself, calculating as many numbers as possible beforehand so that I don’t have to do that afterwards with JavaScript, because for me R is much quicker and I’m more sure of it. Then I immediately take it into just a simple HTML template and start doing stuff with D3. I also don’t rely on any frameworks, and that’s really because of my background, being self-taught, so I only know a few of them. I’ve never used React, for example. Shirley does a lot of React. But for me it’s really just trying to see how far I can go with native JavaScript and D3.

Shirley

Yeah, it’s really cool, because recently we realized that we actually complement each other really well: Nadieh comes from a data science perspective, and I come from a software engineering, computer science background. So the way that we approach even these tools is different. Like Nadieh mentioned R; I have no background in R, and I’d love to learn it. So seeing what she can do with that, and the insights, and even reading her write-ups about the data, is really awesome. For the data processing, I used to use some Python, and then I realized I spend all day in JavaScript anyway and the context switching took too long, so now I just use Node for all of it. Node has such a rich ecosystem nowadays, with all these different packages for getting and processing basically any sort of data I can imagine, so it’s been great. For the front end, definitely D3. I like to use React to organize my code and make all of the cross-linked visualizations, and Lodash for data manipulation, a little bit more on the browser side. And recently Create React App has been amazing for the build process, which I really don’t know much about.

Nadieh

But yeah, like Shirley said, we complement each other. For me, when we do the write-ups every month, it’s so fascinating to read what Shirley did. You should definitely read her October piece on how she got the emoji on the presidents’ faces, and the steps that she took to get to that point. I’ve never done that. So it’s always also us learning from each other, and then I can dig through her code and see how she got these things to work, learn from that, and use that as an example to make stuff for myself as well.

Ray

There are many other ways to do data visualization, such as just using pure CSS. What made you guys choose D3? What are some of the pros and cons of using D3 for data visualization?

Shirley

So I always say to people that, one, yes, there are definitely a lot of other options, including CSS transformations and such, and I can’t speak to them anymore because I’m so deeply invested in D3 that I don’t know how to do visualizations any other way. But one of the things that I like to say about D3 is that it’s really hard to do something simple. For something like a simple line chart, the steps to get there are just enormous. But D3 makes it so easy to do something hard, in the sense that D3 isn’t really a charting library; it’s not a graphing library, and it won’t give you a line chart out of the box. What it does give you is a set of utilities for working with data really easily, as well as a mental model for how to approach visualizing data in the browser, and especially how to approach transitioning between different states of data in the browser. It also makes animations really easy, and different interactions like drag really easy. And even if you don’t use D3 to render anything in the browser, you can use it to calculate positions for all these different kinds of tree layouts or force layouts. So it’s basically a really powerful set of tools that I can choose to use in any way I want, and it’s not restrictive in terms of how I want to use it. Because of that, it’s really great for creativity.

Nadieh

Yes, I fully agree. It’s really the fact that D3 doesn’t create out-of-the-box charts, that it has these smaller, lower-level tools, that gives you greater creativity to create whatever you have in mind. You still have to create things yourself with SVG shapes or with canvas, but it’s so easy to create scales and colors. That’s why I think we both use D3. We also got to know each other through, I guess, the D3 side of data visualization, so we both have that love there. And for us it was also that D3 had just come out with version 4, and we wanted to learn new things from version 4, so we also said from the start that everything we would create would be done in the new version.

Shirley

Yes.

Ray

I think there are two really interesting points there. One is regarding the community, and that’s how I met Shirley. I think D3 is really bringing people together. Perhaps that’s because D3 is a little bit more complicated to use than some of the other charting libraries, but I think that brings people together, people who really want to dive deep into the capabilities of D3. I think the community gets really inspired by the work of D3’s author, Mike Bostock. He’s done some pretty amazing data visualization work at The New York Times. I think that really ties the community together.

Shirley

Definitely. I agree with that so much, both about how complicated D3 is and about the learning curve. I also want to attribute a lot of the culture of at least the Bay Area D3 User Group to how Ian Johnson and Chi change set everything up and the tone that they set. I remember we asked him how he managed to create such a good, inclusive atmosphere for the meetup group, and he said, hey, D3 is so hard to learn, why make anybody feel bad about it? We were all just helping each other out. I think maybe that translates to a lot of our interactions, even with people outside of the Bay Area, which is really cool.

Ray

What are some of the similar events or communities around where you live? Do you see a lot of people who are really interested in D3 or data visualization?

Nadieh

Well, I really wish I could say yes. No, not really. There is not even a meetup around data visualization in general that I know of, and also not for D3. There is an infographics conference in the Netherlands one day a year, which I’m really happy about, so at least one day a year I can see other people who are into data visualization, although most of them are more statically based. Other than that, in Europe, I mean, there’s Malofiej, but that’s also mostly based on data visualization for journalism and newspapers, although interactive is coming up there too. But for the more technical side of data visualization, no, I really have to go to the U.S. for a conference, or hear through Shirley how awesome the D3 unconference is.

Ray

Perhaps this is an opportunity to start something. Amsterdam is a pretty major city and major hub. I’m sure there are a lot of like minded people. Perhaps they’re just working on the Internet.

Nadieh

That is true. I have thought about it, to be honest, but I also have to be true to myself: with all of the side projects that I’m taking on, I just don’t have time to start it. But if somebody could please start a D3 meetup group in Amsterdam, I would be your very first follower. I just can’t manage to also organize a meetup on the side as well.

Shirley

Yeah. I actually don’t know how Nadieh manages everything. I remember she was juggling Data Sketches and four conference talks in September. She’s a superwoman, basically.

Nadieh

It’s like don’t think too far.

Ray

So anyone on the Internet who is hearing this: if you’re interested, contact Nadieh to invite her to a nearby conference or a meetup.

Nadieh

Yes.

Ray

So the point you guys brought up, which I think is an interesting segue to dive into D3’s latest advances, is D3 4.0. Personally, I think one of the reasons why we used D3 previously is that D3 is actually really stable. Some people may perceive that as a plus or a minus, but the stability of the API really helps people build production applications using this vendor library. Lately, though, D3 came out with 4.0, which is quite a drastic change. Can you guys talk a little bit about what has changed in D3 4.0 and how you have been using it, perhaps differently than before?

Shirley

Sure. So, without getting too much in the weeds about the technical things, there have actually been so many changes between version three and version four. I hear that a lot of times communities rebel, just like you said, when something is not so stable or when it has a major change. But personally, and I think this is reflected in a lot of the rest of the community, I just find most of the changes that I’ve come across in version four make so much sense that I’ve kind of been in love with it. Some of the major changes are that it became modular, so the namespacing changed a little bit, and also the core concept of how to update between data states, the enter-update-exit pattern, has changed a little bit.

Personally, my current favorite update between v3 and v4 is the force layout. I’ve always loved the force layout, which is basically a positioning algorithm for network graphs, for node-link graphs. There was a good implementation of it in version three that’s used a lot, and probably a lot of people have seen the Les Misérables example that Mike has. But in version three, for a lot of features around the force layout, things like being able to have multiple focal points and being able to have collision detection, you had to shim in external code. You had to copy and paste code from outside, and then it’s another loop through all of your data, which was a direct performance hit whenever the number of nodes in your graph got too big.

With the force layout in v4, I just think it’s so brilliant how it got updated, which is that now you can define your own forces. I haven’t taken too close a look yet, but my understanding is that, under the hood, it just applies all of those forces one by one at each step of the calculation. A force layout is basically a simulation of thousands of iterations in which you’re positioning the nodes, and all the forces on the nodes act on each other; that’s how the positions get calculated. Now whatever forces you define get applied during that under-the-hood calculation. So it’s a lot more performant, first of all, and second of all, because you can do custom forces, writing your own is so much easier. Mike has also written a bunch of default forces that you can just pick and choose and apply, as well as an ability to specify an x, y coordinate and find the nearest node; it uses a quadtree implementation under the hood, so that’s also performant. Everything just got so performant with that force layout, and I’m just so much in love.
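
As a rough illustration of the idea Shirley describes here (user-defined forces that are applied one by one at each step of the simulation), the following is a minimal TypeScript sketch. It is not D3’s implementation and does not use D3’s API; the names `SimNode`, `Force`, `pullToward`, `keepApart`, and `simulate`, as well as the friction and cooling constants, are hypothetical and chosen only for illustration.

```typescript
// Toy force simulation. This is NOT D3's code; it only illustrates composing
// user-defined forces that are applied one by one at every tick.
interface SimNode { x: number; y: number; vx: number; vy: number; }

// A force adjusts node velocities in place; alpha is the cooling factor.
type Force = (nodes: SimNode[], alpha: number) => void;

// Hypothetical force: pull every node toward a focal point (fx, fy).
function pullToward(fx: number, fy: number, strength: number = 0.1): Force {
  return (nodes, alpha) => {
    for (const n of nodes) {
      n.vx += (fx - n.x) * strength * alpha;
      n.vy += (fy - n.y) * strength * alpha;
    }
  };
}

// Hypothetical force: push apart any two nodes closer than `radius`.
// Exactly coincident nodes are left alone in this sketch (no direction to push).
function keepApart(radius: number, strength: number = 0.5): Force {
  return (nodes, alpha) => {
    nodes.forEach((a, i) => {
      nodes.slice(i + 1).forEach((b) => {
        const dx = b.x - a.x;
        const dy = b.y - a.y;
        const dist = Math.sqrt(dx * dx + dy * dy) || 1e-6;
        if (dist >= radius) return;
        const push = ((radius - dist) / dist) * strength * alpha;
        a.vx -= dx * push; a.vy -= dy * push;
        b.vx += dx * push; b.vy += dy * push;
      });
    });
  };
}

function simulate(nodes: SimNode[], forces: Force[], ticks: number = 300): SimNode[] {
  const friction = 0.6; // assumed velocity damping per tick
  let alpha = 1;        // assumed starting "temperature"
  for (let t = 0; t < ticks; t++) {
    for (const force of forces) force(nodes, alpha); // one force after another
    for (const n of nodes) {
      n.vx *= friction; n.vy *= friction;
      n.x += n.vx; n.y += n.vy;
    }
    alpha *= 0.99;      // assumed cooling schedule
  }
  return nodes;
}

// Two nodes are drawn toward (50, 50) while being kept apart from each other.
const laidOut = simulate(
  [{ x: 49, y: 50, vx: 0, vy: 0 }, { x: 51, y: 50, vx: 0, vy: 0 }],
  [pullToward(50, 50), keepApart(20)]
);
console.log(laidOut);
```

In this sketch a new behavior is just another function in the `forces` array, so a custom force is handled the same way as the built-in ones rather than needing a separate pass over the data.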

Ray

Nadieh, would you like to add to that, how you have been using D3 4.0?

Nadieh

Yeah, sure. Well, in my previous month, I was also getting to know the new force layout, but I have nothing to add to Shirley’s message, because she said it all; it’s really amazing. I’m glad he kept in the chord layout, because I’ve been hacking and using that one for quite a few things, although there have been no changes to it. So for me, it was mostly the enter-update-exit statements, so how you go from one data set to another data set, and some namespace changes. But I still have a lot of D3 v4, I think, to get to know better.
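
For readers new to the enter-update-exit idea mentioned here, going from one data set to another amounts to sorting items into three groups by a key. The sketch below shows that bookkeeping in plain TypeScript. It does not use D3’s selection API; `dataJoin`, `JoinResult`, and the sample records are hypothetical names used only for illustration.

```typescript
// Conceptual data join: which items are new, which persist, which are gone.
// This is an illustration of the pattern, not D3's selection.data() itself.
interface JoinResult<T> { enter: T[]; update: T[]; exit: T[]; }

function dataJoin<T>(previous: T[], next: T[], key: (d: T) => string): JoinResult<T> {
  // Prototype-free objects used as key sets, so keys like "constructor" are safe.
  const previousKeys: { [k: string]: boolean } = Object.create(null);
  for (const d of previous) previousKeys[key(d)] = true;
  const nextKeys: { [k: string]: boolean } = Object.create(null);
  for (const d of next) nextKeys[key(d)] = true;

  return {
    enter: next.filter((d) => !previousKeys[key(d)]),            // only in the new data
    update: next.filter((d) => previousKeys[key(d)] === true),   // in both, new values
    exit: previous.filter((d) => !nextKeys[key(d)]),             // only in the old data
  };
}

// Hypothetical medal counts before and after a data refresh.
const medalsBefore = [{ country: "NL", medals: 19 }, { country: "US", medals: 100 }];
const medalsAfter = [{ country: "US", medals: 121 }, { country: "JP", medals: 41 }];
const joined = dataJoin(medalsBefore, medalsAfter, (d) => d.country);
console.log(joined.enter, joined.update, joined.exit);
```

In a visualization, the `enter` group would typically get new shapes, the `update` group would transition to its new values, and the `exit` group would be removed.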

Ray

Shirley, one of the things you touched on is the fact that D3 is a lot more modular than it was previously. You also mentioned that you use React together with D3. I heard that the modularity really helps with other frameworks like React or Angular. Has that helped you at all in how you use it with React?

Shirley

I’ll guiltily admit that I don’t really, because modularity means I need to know what packages to load. When I start a React component, I might not know, and I just want to get started. So I just load in the whole D3 package and I don’t use the modularity. I always tell myself that I’ll go back, figure out which ones I use, and load just those packages, but I never do. I’m sorry.

Nadieh

No, I always load the whole thing. And I might even load some extra plugins on top of that, such as the geographically based plugins, or the Sankey layout, which is quite nice for showing flows. I’m wondering whether that one is going to be added to D3 anyway. But yeah, I always load in d3.v4.min.js.

Ray

I think because you guys are doing more one-off projects, perhaps it’s not as important. But at work we also use D3 quite heavily, and we’re also enjoying the new force layout. We do use React, and previously we loaded the whole library in. Lately, though, we’ve made an effort to reduce the payload size, so we’ve been working on removing the pieces of D3 that we don’t need. I’m not working on it personally, but I know that it has helped a lot of people’s production apps. So I’m looking forward to learning more about it.

Nadieh

There are also, of course, visualizations made about the dependencies of these modules, and other things that really help. If you want to use the modules, you can go there to figure out which ones you need.

Ray

So let’s talk a little bit about charting, because I would imagine that for a lot of people charting is at least one of the primary uses of D3: standard charts, like line charts and bar charts. There are a lot of different libraries built on top of D3 for charting. Can you guys talk a little bit about those and your experiences with them?

Nadieh

Well, I have to admit that I’ve seen a lot of names come by, but I have not really used a lot of them. I’ve used RAW from Density Design, which is more of an app where you can really quickly plug in some data and get a visualization out of it. But I only go to D3 once I really know what I want to visualize. I never use it for quick and dirty visualizations to get insight into the data, because those I always keep in R, which has a good package called ggplot2 that I use to make quick line charts or bar charts. Once I know what’s in the data and I want to make something more, I guess, explanatory out of it, I go to D3. So I’ve never really had to use any charting libraries on top of D3. Those are always too limiting for me, because when I go to D3, I want to do something a bit more artistic, and that’s never an option in most of the charting libraries that I’ve come across. But there are quite a few of them. As far as I know there’s Highcharts, and maybe quite a few programs based on D3, which I think are quite good. But I cannot really speak from personal experience.

Shirley

Yes. So I’m exactly the same as Nadieh. I’ve heard of all of these, and also NVD3, and there are a lot of names that I’ve heard in passing. But just like Nadieh, I haven’t used any of them, because by the time I get to the coding, D3 is just so good in terms of not limiting me to any single chart form that I go straight to it. Interestingly enough, I also haven’t really built any line charts or any of those, because previously at work a lot of the things that we did were network graphs, so tree layouts and force layouts, and not as many standard charts. Even in my side projects, I think I’ll make a histogram once in a while, but those are easy to put together really quickly with D3 anyway, and I always do something weird on top of it, some weird interaction or something, so I’ve never started with a charting library.

Ray

Yeah, I think it’s an interesting point. I think the trend has been toward building more and more sophisticated charts or data visualizations overall, and I think that’s another major reason why people gravitate towards D3: people are trying to go above and beyond just simple line charts and bar charts. I see that happening a lot, definitely at startups.

Nadieh, earlier you briefly mentioned data processing using R. I’d like to dive into that a little bit more. I think one of the challenges that people face with data visualization is where to get the data, how to process it, how to clean it, and then how to summarize it. At the end of the day, we are doing data visualization using web technology, so payload size is a major concern. Can you talk a little bit about your process with data?

Nadieh

Sure. So I start just with a simple Google search, trying to find terms that lead me to data sets that could either be complete data sets found in like Google Docs, Google spreadsheets, or it could be an API that gives you access to the data, if you give it the right terms, or web scraping, it can anything in the webinar for you, I try to find it, it’s sometimes it takes a while to figure out the right search terms for the large rings, the very first month, that data took actually quite some time to find these, this data set on the number of words that each character said, and she knows the movements. And then when I have that data, I tried to get to load it into our so either write the code for to get the IP or write the code to get the to read the data, set it from a CSV clean perfectly clean into our and I really have the raw data, and then I need to figure out what do I want to do with this data? So do I want to have an average over all of the numbers that I have for all of the movies or all of the metals? Or do I want to have low level. So at first, I tried to just clean it and make sure that the data is correct. So I try to make some some aggregates and some summaries, just to see if the numbers add up so that I can resemble compare that to numbers that are available somewhere else online, but then I go usually go into the next phase is already sketching. Because then I need to figure out what do I want to do with this data. And then I start scratching the light, know what insights that I want to convey or what numbers that I want to eventually visualize, then I go back to my data step talk. So I go back to our and I try to get this get these numbers actually, from this clean raw data. And that in again, usually involves some loops to get means or other things, or some vector eyes stuff. And then I go again, to the coding part, and I try to figure out, Okay, so how to get this data in to the visual that I want. 
And then I might actually go back into R again to pre-process some numbers. But as you are hearing already, this really only works if you have a static data set, because it’s a process that is done on that particular data set to get the insights that are in there onto your screen. But I guess if you want to have it more on a website, or maybe with more live data, you would go more into a pre-processing kind of setup. If you want to do some more advanced data preparation, you would write some function that would do that for you, as much as it can, beforehand. Because when you make your data visualization, you know what you want at the end. So you can do all of the data preparation all the way at the start and then feed it into your data visualization. I don’t know if that sort of explains it, or if you have a more specific question.
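
A minimal sketch of the load, sanity-check, and summarize loop Nadieh describes. She works in R; Python is swapped in here, and the inline CSV, its column names, and every number in it are made up for illustration. The function names (`load_rows`, `total_words`, `mean_words_per_character`) are hypothetical as well.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical raw data: one row per character per movie. All counts are invented.
RAW_CSV = """character,movie,words
Frodo,1,100
Frodo,2,60
Sam,1,80
Sam,2,120
"""


def load_rows(text):
    """Read CSV text into a list of dicts, converting numeric columns to int."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "character": row["character"],
            "movie": int(row["movie"]),
            "words": int(row["words"]),
        })
    return rows


def total_words(rows):
    """Aggregate used as a sanity check against a total published elsewhere."""
    return sum(r["words"] for r in rows)


def mean_words_per_character(rows):
    """Summary number per character, the kind of value a chart would consume."""
    grouped = defaultdict(list)
    for r in rows:
        grouped[r["character"]].append(r["words"])
    return {name: mean(values) for name, values in grouped.items()}


rows = load_rows(RAW_CSV)
print("total words:", total_words(rows))
summary = mean_words_per_character(rows)
print(summary)
```

Because all of the preparation happens before the visualization code runs, only the small `summary` structure would need to be shipped to the browser.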

Ray

Yeah, I think that helps a lot. What if the data is, say, a terabyte? Do you think this workflow can still work? Or would you add something, or do something differently?

Nadieh

Well, if you have, like, a terabyte of data, the question is, how are you going to show that on your screen? I don’t have a terabyte of pixels on my screen, so one pixel cannot even be one data point. So usually, the more data you have, the more you have to think about aggregation, or picking out a few interesting parts and then still aggregating the rest. It gets into a different kind of question. How to visualize a small data set versus a really big data set are kind of separate things that lead you down different paths. But I guess for a static piece that you want to make, like a poster, you can use whatever amount of data you want; you only have to process that once on your own laptop. But yeah, usually people have to get it onto a website, so you really want to have it as small as possible. So thinking about which aggregation gives you the best results is a good thing to keep in mind. Say, instead of plotting all of the points, maybe you can plot the mean, the average, with a confidence band around it, instead of every point that you have.
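
One way to sketch the "mean with a confidence band" idea is shown below. This is an assumption-laden illustration, not a method from the episode: the points are synthetic, `aggregate_bins` is a hypothetical helper, and the band uses a simple normal approximation (mean plus or minus 1.96 standard errors).

```python
import math
from statistics import mean, stdev


def aggregate_bins(points, bin_width):
    """Collapse (x, y) points into one row per x-bin: mean y and an approximate 95% band."""
    bins = {}
    for x, y in points:
        start = (x // bin_width) * bin_width  # left edge of the bin this point falls in
        bins.setdefault(start, []).append(y)

    summary = []
    for start in sorted(bins):
        ys = bins[start]
        m = mean(ys)
        # Normal-approximation band; a single-point bin gets a zero-width band.
        half = 1.96 * stdev(ys) / math.sqrt(len(ys)) if len(ys) > 1 else 0.0
        summary.append({
            "x": start,
            "mean": m,
            "lower": m - half,
            "upper": m + half,
            "n": len(ys),
        })
    return summary


# Synthetic data: 1,000 points with integer x values (purely illustrative).
points = [(i, (i % 7) * 1.5) for i in range(1000)]
summary = aggregate_bins(points, bin_width=100)
print(len(points), "points reduced to", len(summary), "rows")
```

The chart then draws one line through the `mean` values and a shaded area between `lower` and `upper`, so the payload scales with the number of bins rather than the number of raw points.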

Ray

That makes a lot of sense, I think the key there is aggregation for sure.

Shirley

And just a quick note, too. You know, I haven’t worked with quite a terabyte before, but I’ve worked with larger data sets. And I think a lot of times when it’s a terabyte, or when we get a large data set, a lot of it is actually information that we don’t need. So I definitely agree with aggregation, but also with what Nadieh was saying about figuring out exactly what is needed in the visualization. That takes a lot of trial and error, and it might change throughout the course of development. But then it’s being able to figure out exactly which attributes of the data we need, and filtering out everything else. I experienced this with the latest October Data Sketches, on a much smaller scale. I had a few data sources that were, like, seven megabytes, which does not sound like a lot at all, but seven megabytes, even compressed, does take some load time in the modern browser. And then I figured out that I actually didn’t need the majority of that data, because it just came with all of those attributes and features. Just filtering out the extraneous attributes brought it down to about a megabyte. And then after I bundled it with Create React App, my whole thing was just 400 kilobytes, so that was great for load time. So definitely figuring out and trimming the fat, after we figure out exactly what data we want to show.
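
The attribute-trimming step Shirley describes could look roughly like the sketch below. Her project was bundled with Create React App; Python is used here only to illustrate the idea. The records, field names, and helper functions (`trim_records`, `payload_bytes`) are hypothetical.

```python
import json


def trim_records(records, keep):
    """Return copies of the records containing only the attributes the chart needs."""
    return [{key: record[key] for key in keep if key in record} for record in records]


def payload_bytes(records):
    """Size in bytes of the records serialized as compact JSON."""
    return len(json.dumps(records, separators=(",", ":")).encode("utf-8"))


# Made-up source data carrying attributes the visualization never reads.
records = [
    {
        "id": i,
        "title": f"Item {i}",
        "year": 2000 + i % 17,
        "description": "x" * 200,     # unused by the chart
        "internal_notes": "y" * 100,  # unused by the chart
    }
    for i in range(500)
]

trimmed = trim_records(records, keep=("id", "title", "year"))
before = payload_bytes(records)
after = payload_bytes(trimmed)
print(f"before: {before} bytes, after: {after} bytes")
```

Running a script like this once, ahead of time, means the browser only ever downloads the trimmed file.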

Ray

Great, I think that makes a lot of sense. Well, I think this is a good point to wrap up on. Today we talked about Shirley and Nadieh’s latest collaboration, Data Sketches. We talked about data visualization in general, using web technologies. We talked about the advances in D3 and how important D3 is. But I think the key thing here is that D3 is really just one piece of data visualization. At the end of the day, a lot of it is about the engineer’s and the artist’s judgment calls. It’s about processing data, gathering the data; there’s just a lot more to it. So we definitely encourage everyone to go out there and learn more about data visualization, all the aspects of it. Shirley, how do people find you and follow you on the internet?

Shirley

Oh, so I have a Twitter handle, and it’s @sxywu. And really briefly, those are my initials. It was not intended. So please follow me on there, and it’s the same for my website, sxywu.com. And thank you very much for having me on the show.

Ray

You’re welcome. And Nadieh, how do we find you?

Nadieh

Yes, I’m also on Twitter, @nadiehbremer. But I am going to have to tell you to look up how to spell my name online. You can find me through the Data Sketches website. I also have a personal website that’s called Visual Cinnamon, and you can reach me on Twitter through that one as well.

Ray

I personally highly recommend Nadieh’s website. There’s a lot of really quality content and really amazing work on there.

Nadieh

Thank you very much!

Ray

Great. And you can find me on Twitter @rayshan. You can also find Tracy on the internet. Tracy, how do people find you on the internet?

Tracy

It’s just @ladyleet on Twitter. So thanks so much, guys. You know, it’s also really nice to see the passion that you guys have for each other. It’s really inspiring, and such a refreshing change from all the guys I talk to that don’t have warm and fuzzies. It’s really nice. They’re like, “You’re cute, Tracy.”

Shirley

Yeah, come to the D3 meetups or the D3 community and we try to keep it warm and fuzzy there.

Tracy

Yeah, it’s pretty great.

Ray

Great. Well, thanks everyone for joining us. You can find the Modern Web on Twitter @modernweb_ and on the web at modern-web.org. We will see you soon in the next episode of the Modern Web podcast.
