Problems solved: conversations with devs building the modern internet | Vol. I, Spread Group
Martin Breest is a Senior Software Architect at Spread Group, a collection of global, print-on-demand online-retail brands based in Leipzig, Germany, and a Fastly customer. Our Principal Developer Advocate, Andrew Betts, spoke with Martin about his team’s creative use of chatbots to run Fastly commands, his philosophy on making mistakes, and what challenges he’s most looking forward to solving in the coming year.
Andrew: One of the things I most love about customers who use Fastly in a creative way is that you make us think differently about our products and what they're for, and how customers are using them. You've done quite a lot of interesting things on our platform — tell us a bit about the chatbot that you built and what you use that for.
Martin: That's probably the coolest piece we’ve created. The chatbot itself is not very technical, but the coolest thing is what you can do with it: the CDN layer is no longer a black box or something you need to ask Ops to configure, it's now something everyone can touch.
I actually came up with the idea for the chatbot during a presentation from another CDN vendor. The problem was that it wasn't easy to implement on our previous CDN, so we just never did anything about it. And then, coming to Fastly, we've got all the APIs available, and you've got dynamic snippets suddenly, and you could roll out changes in seconds, and then the idea came into reality.
I wanted to use the chatbot to really empower developers; to put all the CDN capabilities in their hands, allow them to roll out changes, to purge when they want, to start A/B tests, to block IPs — all in the context of a team chat. So, someone might say, "Hey, do you see that we have a pen test running?" And then someone might say, "Yes, I can see that. Can we do something about it?" and then someone else says, "Of course. We can just block it." And then someone would just block the pen test in the context of the chat, and everyone on the team knows that you did it. That's kind of cool if you ask me.
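As an editorial illustration of the workflow Martin describes, here is a minimal sketch of how a chat message might be turned into a validated CDN action, plus the audit line echoed back to the channel. The `!cdn` command syntax, the `CdnAction` type, and both function names are hypothetical and are not Spread Group's implementation. Applying the action through the Fastly API is deliberately left out.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class CdnAction:
    """A CDN operation requested in chat, plus who asked for it."""
    kind: str          # "block" or "purge"
    target: str        # IP/CIDR for "block", URL or key for "purge"
    requested_by: str


def parse_command(user: str, message: str) -> CdnAction:
    """Parse a hypothetical '!cdn <verb> <target>' chat message."""
    parts = message.split()
    if len(parts) != 3 or parts[0] != "!cdn":
        raise ValueError("expected: !cdn <block|purge> <target>")
    _, verb, target = parts
    if verb == "block":
        # Validate and normalise the address so a typo never reaches the CDN.
        target = str(ipaddress.ip_network(target, strict=False))
    elif verb != "purge":
        raise ValueError(f"unknown verb: {verb}")
    return CdnAction(kind=verb, target=target, requested_by=user)


def audit_line(action: CdnAction) -> str:
    """Message posted back to the channel so the whole team sees the change."""
    return f"{action.requested_by} ran {action.kind} on {action.target}"
```

Keeping parsing and validation separate from the code that actually applies the change means a mistyped address is rejected in the chat, before anything is rolled out.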
Andrew: Do you find that having that chatbot available has changed the culture of the team or the way that the team approaches operations in general?
Martin: When we worked with the old CDN vendor, I had to create a ticket for operations, and at some point during the day (or the next day, or that week) they made a change to the CDN layer. Then, maybe, I got my configuration rolled out, and maybe it was the right one, maybe the wrong one. And then it started all over again.
Now, it's all going much quicker. All CDN configuration is in Git and we can roll out changes using our tools. We have, for example, the Spreadshirt team, which usually asks me to make a configuration change. I can do it for them and tell them using the chatbot, "Hey, I just rolled out your requested change. Here is the new Git revision. You can see what I did. You can check it out."
Other teams, such as the Teamshirts team, do a lot of operations themselves now. In the beginning, I did it for them, until they realized how easy it was and started to do it themselves.
We really empower developers to the degree they want and everything is much faster now. So configuration changes, purges, test rollouts — they’re just in the flow of doing things now and we don’t have to wait for a couple of days, we can get it done right away.
Andrew: And it gives you some confidence as well. I suppose the anxiety and stress over releases go away.
Martin: Yeah, we also have live logs for everything, so if you break something you can just fix it. We trust our people. We give them the tools. They can make mistakes, but hopefully not make the same mistake twice.
Andrew: That's a pretty good philosophy. You mentioned logging: you log an awful lot from Fastly.
Martin: Yes. There’s a story here as well. With our previous CDN vendor, we got logs with a one-hour time delay. That was one problem. The other problem was, and I'm not kidding here, that we had only 40 characters for custom data.
And then we moved to Fastly and we just turned on live logging and now we can log everything you can imagine — simple things like request URLs and status codes, of course, but also more complex things too, like geoinformation, device information, and all kinds of headers. I recently found out about proxy type and proxy description, which is a pretty cool thing for security research and getting more insights into attacks or fraud, for example.
Andrew: I guess the bottom line is the data is available and you use it for whatever you want. Are you using Elasticsearch to receive the logs?
Martin: Yes, we stream them into an Elasticsearch cluster and have them available for research right away. That is one nice thing. And the other nice thing is that we can connect them to our data center routing logs, our application logs, our error logs, and logs certain teams write for their own purposes. And by doing that, we get really good real-time insights into what's going on.
For example, we had problems with credential stuffing attacks last year and we could see, in real time, "Okay, things are happening now. Okay, we apply a block for a specific user-agent, or for a specific user-agent and header combination, now." And you could see in real time that our block actually worked. That's pretty cool.
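To make the pattern concrete, here is a rough sketch of the kind of check that streamed logs allow: count failed logins per user-agent, then test whether a record matches a user-agent and header block rule. The field names (`path`, `status`, `user_agent`, `accept_language`), the threshold, and the sample records are all assumptions for illustration, not Spread Group's log schema or tooling.

```python
from collections import Counter


def suspicious_agents(records, login_path="/login", threshold=3):
    """Return user-agents with at least `threshold` failed logins."""
    failures = Counter(
        r["user_agent"]
        for r in records
        if r["path"] == login_path and r["status"] == 401
    )
    return {ua for ua, count in failures.items() if count >= threshold}


def matches_block(record, user_agent, header=None, value=None):
    """True if the record matches the user-agent and, if given, the header value."""
    if record["user_agent"] != user_agent:
        return False
    return header is None or record.get(header) == value


# Illustrative records only; real log lines would carry many more fields.
SAMPLE = (
    [{"path": "/login", "status": 401, "user_agent": "bot/1.0", "accept_language": ""}] * 3
    + [{"path": "/login", "status": 401, "user_agent": "Mozilla/5.0", "accept_language": "de"}]
    + [{"path": "/shop", "status": 200, "user_agent": "bot/1.0", "accept_language": ""}]
)
```

Running `matches_block` over the records that arrive after a rule goes live is one simple way to confirm, as Martin puts it, that the block actually worked.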
And not only do we have the logs for operational purposes, but also for reporting. To have a monthly CDN report that makes transparent what the CDN actually costs and why — that was not possible before and it's definitely possible now. And that's a major difference.
Andrew: We should talk about seasonality since we're right in the middle of holiday shopping for most retailers. What changes for your team when you enter the holiday season and what's different this year because of COVID?
Martin: Spread Group is a print-on-demand business. We have production facilities in different countries in Europe and North America. We have actual workers in those facilities printing on t-shirts, cups, and other clothing and accessories. So, because we are not only a digital company, but also deal with physical goods, that involves safety regulations, lockdown restrictions, and potential factory shutdowns. And that's what happened in March and April. Parts of our factories had to close down and that made it difficult for us, because we didn't know how long the lockdowns would last, how long we could keep up our cash flow, things like that.
But then, of course, the business came back. The thing that kind of saved us, actually, because sales were down in March and April, was masks. People needed them and we were able to print anything they wanted on them.
Then the online shop owners at Spreadshop came back as well. And because we’re also a Shopify print-on-demand fulfiller, those merchants used our SPOD service to fulfill their orders. So, what began as a pretty tough year actually ended pretty well for us. 2020 will be another record year for Spread Group with double-digit revenue growth.
Andrew: You're using A/B testing as well. What kinds of things are you testing? And how are you measuring the results that you're sending back? On a personal note, I'm interested to hear your answer because I wrote the solution pattern for A/B testing using Fastly. So I'm curious to know whether you're using that or whether you're using a different mechanism.
Martin: I basically use your solution. I add a few dynamic snippets to the service configuration that contain the generated code, and then we start and stop tests using our command-line tool or the chatbot.
And what we test is what I consider to be more complicated stuff — things that involve development work. For example, testing one version of a marketplace list with images showing just the designs (design view) against another version with images of the products with the designs placed on them (product view). Or a t-shirt designer in a certain layout versus another t-shirt designer in a different layout. And then it really comes down to having the Analytics team looking at the numbers and the A/B Test team testing the hypothesis and evaluating the outcomes.
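For readers unfamiliar with how visitors get split between variants, here is a minimal sketch of weighted bucket assignment. It is not the Fastly solution pattern Martin refers to, and it is not VCL. It swaps in a plainly different technique, hashing a visitor ID so the same visitor always lands in the same variant, because that is easy to show in a few lines. The function name, the test name, and the variant names (borrowed from Martin's design view versus product view example) are illustrative assumptions.

```python
import hashlib
from typing import Mapping


def assign_variant(visitor_id: str, test_name: str, weights: Mapping[str, int]) -> str:
    """Deterministically map a visitor to a variant, in proportion to the weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # Hash test name and visitor together so different tests split independently.
    digest = hashlib.sha256(f"{test_name}:{visitor_id}".encode()).digest()
    point = int.from_bytes(digest[:8], "big") % total
    # Walk the variants until the point falls inside one variant's share.
    for variant, weight in weights.items():
        if point < weight:
            return variant
        point -= weight
    raise AssertionError("unreachable when all weights are non-negative")


# Hypothetical 50/50 test of the two marketplace list layouts.
WEIGHTS = {"design_view": 50, "product_view": 50}
```

Because the assignment depends only on the visitor ID and the test name, starting or stopping a test only requires changing the weights, which fits the start and stop workflow described above.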
Andrew: What's next for you in terms of big technical challenges you look forward to digging into?
Martin: One thing that is always a problem for us is image delivery. We sometimes change product images, for example. What was the last thing I had? I think it was a cup, and the image of the cup changed, which means that all the images you have with designs on cups need to change.
Andrew: So your canonical cup image changed and then all of the possible artwork that you have available that can be rendered on a cup needs to be changed? That's really interesting. And so you'd potentially use Image Optimizer to actually superimpose the artwork onto the generic backgrounds, or do you still see that as something you would do at origin?
Martin: That's a very good question. There are three things that should be easy to do with Image Optimizer for us. First is delivering different image formats like JPEG and WebP; that's the easiest one. The second is resizing images, like delivering a 378-pixel version, a 500-pixel version, and an 800-pixel version. That's an easy one too. The third thing, overlays, becomes a little bit more difficult.
Before we added 3D model images — real people wearing our products — it was a little bit easier because you just had a plain t-shirt and we had a design attached to the plain t-shirt. And that could be done using 2D transformations and overlays.
But with 3D it's a little bit more complicated. It's not that you always attach it plain to the t-shirt. Sometimes you have 3D model images that are a little bit tilted and you have to attach the design using the right perspective. Sometimes the design is attached to the arm, you look at it from the front and you only see a piece of the design on the arm. So it's a bit more difficult. We have our own 3D rendering process written in Java at the backend. But the idea would be to find out how we can make that easier, more scalable, and try to move it to the edge.
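The two simpler cases in Martin's answer, format conversion and resizing, are typically driven by query parameters on the image URL. The sketch below builds the six size and format combinations he mentions. It assumes Fastly Image Optimizer's `width` and `format` parameters with the values `jpg` and `webp`, which should be checked against the current documentation. The hostname, the path, and the `variant_urls` helper are hypothetical.

```python
from urllib.parse import urlencode

WIDTHS = (378, 500, 800)      # pixel widths mentioned in the interview
FORMATS = ("jpg", "webp")     # assumed Image Optimizer format values


def variant_urls(base_url: str) -> list:
    """Return one URL per width and format combination for a source image."""
    return [
        f"{base_url}?{urlencode({'width': width, 'format': fmt})}"
        for width in WIDTHS
        for fmt in FORMATS
    ]


# Hypothetical source image URL.
URLS = variant_urls("https://images.example.com/products/cup.png")
```

Overlays and perspective-correct placement of designs, the third case, do not reduce to a pair of query parameters, which is why Martin treats them separately.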
Andrew: I was just thinking: one possibility here could be that, if you had a source image with a distortion mesh attached to it, then in theory you could do a matrix transform on the artwork such that it would fit the distortion mesh on the background at the edge. You wouldn't be able to do that in VCL or in our current Image Optimizer, but you'd be able to do it in Compute@Edge.
Martin: I'm really looking forward to doing that. That would be really great.
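One way to read the matrix transform idea from this last exchange is as a perspective (projective) mapping applied to each cell of the mesh: the artwork's unit square is warped onto a four-cornered region of the product photo. The sketch below computes that mapping in closed form for a single quad. Treating each mesh cell as one quad is an assumption made here, not something stated in the interview. The code is plain Python for readability, not a Compute@Edge program, and the example coordinates are invented.

```python
def square_to_quad(quad):
    """Return a function mapping (u, v) in the unit square onto `quad`.

    `quad` lists the target corners for (0,0), (1,0), (1,1), (0,1), in that order.
    """
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = quad
    dx1, dx2 = x1 - x2, x3 - x2
    dy1, dy2 = y1 - y2, y3 - y2
    sx = x0 - x1 + x2 - x3
    sy = y0 - y1 + y2 - y3
    den = dx1 * dy2 - dx2 * dy1
    if den == 0:
        raise ValueError("degenerate quad")
    # g and h are the perspective terms; both are zero for a parallelogram.
    g = (sx * dy2 - dx2 * sy) / den
    h = (dx1 * sy - sx * dy1) / den
    a, b, c = x1 - x0 + g * x1, x3 - x0 + h * x3, x0
    d, e, f = y1 - y0 + g * y1, y3 - y0 + h * y3, y0

    def warp(u, v):
        w = g * u + h * v + 1.0
        return ((a * u + b * v + c) / w, (d * u + e * v + f) / w)

    return warp


# Invented example: a tilted print area on a product photo.
warp = square_to_quad([(0, 0), (2, 0), (3, 3), (0, 2)])
```

Calling `warp(0.5, 0.5)` gives the position on the photo where the centre of the artwork would land. A renderer would evaluate the mapping, or its inverse, for every pixel of the design.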