AdvertPro has been designed to be as easy as possible to operate in a cluster. To that end, each AdvertPro instance in the cluster has an equal role. There is no master server in charge of the other servers. This makes both scaling and failover very easy to accomplish. To increase capacity you simply add an additional server to the cluster. When a server goes offline, either for planned maintenance or due to failure, your load balancer can simply route traffic to the other servers in the cluster until it comes back online.
None of the AdvertPro servers communicate with each other. They operate completely independently with their own embedded in-memory databases, which means that as you add servers to the cluster you get an almost perfectly linear increase in capacity. For example, if you can handle 4 million impressions per hour with two servers, you could handle roughly 8 million impressions per hour with four servers.
Data is synchronized across all of the AdvertPro servers by scanning the MySQL database once a minute for changes to campaigns, media, zones, etc. This scanning is extremely efficient. Each item in the database has a unique ID number along with a revision number. Each server in the cluster only needs to query a list of those two numbers, which is very small. A server fetches updated data only when it holds an older revision of an item, and then it fetches the data for that specific item only.
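The scan described above can be sketched roughly as follows. This is a minimal illustration, not AdvertPro's actual code: the function names `find_stale` and `sync_once`, the `fetch_index` and `fetch_item` callbacks, and the in-memory `database` stand-in are all hypothetical, and the sketch assumes a missing local item is treated the same as an outdated one. It does not cover deleted items.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Index = Iterable[Tuple[int, int]]  # (item ID, revision number) pairs


def find_stale(local_revs: Dict[int, int], index: Index) -> List[Tuple[int, int]]:
    """Return (id, revision) pairs that are newer than the local copy or missing from it."""
    return [(item_id, rev) for item_id, rev in index if local_revs.get(item_id, -1) < rev]


def sync_once(
    local_items: Dict[int, dict],
    local_revs: Dict[int, int],
    fetch_index: Callable[[], Index],
    fetch_item: Callable[[int], dict],
) -> List[int]:
    """Run one scan: pull the small index, then fetch only the stale items."""
    updated = []
    for item_id, rev in find_stale(local_revs, fetch_index()):
        local_items[item_id] = fetch_item(item_id)  # full data for this one item only
        local_revs[item_id] = rev
        updated.append(item_id)
    return updated


# Usage with an in-memory stand-in for the shared database (hypothetical data).
database = {1: (3, {"name": "Campaign A"}), 2: (7, {"name": "Zone B"})}
items = {1: {"name": "Campaign A (old)"}}
revs = {1: 2}

changed = sync_once(
    items,
    revs,
    fetch_index=lambda: [(item_id, rev) for item_id, (rev, _) in database.items()],
    fetch_item=lambda item_id: database[item_id][1],
)
print(changed)
```

In this stand-in, item 1 is refreshed because its local revision is older, and item 2 is fetched because it is missing locally. A second call with no changes in between fetches nothing, which is what keeps the once-a-minute scan cheap.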
In distributed computing terms this is known as eventual consistency: changes do not occur in real-time across all servers, but eventually they will all have the same data. Waiting a mere minute for changes to propagate is a practically irrelevant trade-off for near linear scalability.
AdvertPro allows you to run all of its major components on a single domain name. However, each component may also be given its own domain name for various reasons.
|Component(s)||Example Domain Name||Request URI Prefix(es)||Purpose / Use Cases|
|Control Panel and API||ads.yourdomain.com||/servlet/control/*||The control panel and API may be given their own domain name, such as ads.yourdomain.com, to make identifying and distributing their requests easier if your load balancer can't identify them based on request URI patterns.|
|Ad Serving||serving.yourdomain.com|| ||The ad server may be given its own domain name, such as serving.yourdomain.com, to make identifying and distributing its requests easier as well.|
|File Serving||creatives.yourdomain.com||/servlet/files/*||The file server may be given its own domain name, such as creatives.yourdomain.com, which you can CNAME to a CDN (content delivery network) provider of your choosing and have them use serving.yourdomain.com as the origin domain name they'll pull content from.|
AdvertPro writes files to three subdirectories within the /usr/local/tomcat/webapps/ROOT directory that you might wish to map to network storage:
|Directory Name||Suggested Handling|
|backup||The built-in backup will run once a day at 5am by default. It creates a compressed backup in a single ZIP file and stores it in this directory on one of the servers. Remember, AdvertPro has no master server, so whichever server in the cluster first says it's going to do the backup gets to do it. If you're going to use the built-in backup you should map this directory to network attached storage to keep all of the backups in one place.|
|config||The configuration files in this directory store the settings for your AdvertPro license key, accessing the MySQL database and the MaxMind GeoIP database. You might wish to map this directory to network attached storage. Of course, if you wish to give each AdvertPro server a different username/password to access MySQL you should skip doing that. All other settings are stored in the MySQL database and will automatically be replicated across the cluster.|
|logs||Each of the AdvertPro servers in the cluster maintains its own error and event logs. However, you may wish to map this directory to network attached storage. Having the log files consolidated into one place simply makes it easier to review them as opposed to logging into each server to check them. Since the hostname of the server is written into each log entry you'll still be able to easily identify which server produced a given log entry.|
There are multiple ways that you can go about implementing load balancing. Hardware load balancers may be used, but they are often costly to implement and more difficult to operate. Software load balancers, such as Nginx, are generally more cost effective, easier to operate and provide more flexibility. DNS based load balancing can also be used. However, be aware that it will not provide adequate failover if implemented without automated health checks.
Some cloud providers also offer what are typically called cloud or virtual load balancers. They work fine, but be aware that cloud servers are not suitable for high traffic deployments. Cloud servers simply can't provide the necessary disk throughput because they run off of large network attached storage arrays, and network latency will kill a database intensive application. Dedicated hardware is definitely the way to go, so don't say you weren't warned if you choose cloud servers instead. That's not to say that you can't take a hybrid approach though! You might choose to use cloud load balancers with cloud servers for ad serving and file serving while using dedicated servers for MySQL.
Whichever method you choose, eventual consistency does put some strict requirements on how traffic should be distributed between servers in the cluster. If this is done properly, you can make it completely transparent to your users.
As users make changes on one server those changes will not immediately be made visible on other servers. It would cause great confusion if users were getting different data every time they viewed a different page in the control panel. This is best dealt with by configuring your load balancer to route control panel and API requests only to one specific server in the cluster with optional failover to another server. The reason for this is that it hides what's really going on with the eventual consistency. Forcing all users to make changes on the same server gives them all the same up-to-date view of data changes!
All ad serving requests must use sticky-session load balancing. This is due to the fact that user sessions are not replicated in real-time between all servers in the cluster. Failure to use sticky-session load balancing will result in erratic behaviour from many features that depend on user sessions, such as action tracking, competitor filtering, frequency capping, retargeting or roadblock campaigns.
Requests for files that are served by AdvertPro may be sent to any server in the cluster. There are no specific requirements as to how you should balance these requests. In fact, these requests may be offloaded to a CDN to improve loading speeds and reduce load on the cluster. A cache server such as Varnish would also be suitable for serving files, but won't provide the same benefits as a CDN.
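The three routing rules above can be summarized in a short sketch. It is only an illustration of the policy, not a replacement for a real load balancer: the `route` function, the request kinds `"control"`, `"ad"` and `"file"`, and the server names are hypothetical. For stickiness it uses rendezvous hashing on a session or visitor identifier, which is one possible technique chosen here for illustration; your load balancer may implement sticky sessions differently (for example with cookies or client IP hashing).

```python
import hashlib
import random
from typing import List, Set


def pick_sticky(session_id: str, servers: List[str]) -> str:
    """Rendezvous hashing: the same session always maps to the same server
    as long as that server remains in the candidate list."""
    def score(server: str) -> bytes:
        return hashlib.sha256(f"{session_id}|{server}".encode("utf-8")).digest()
    return max(servers, key=score)


def route(kind: str, session_id: str, servers: List[str], healthy: Set[str]) -> str:
    """Choose a backend for one request according to the rules above."""
    live = [s for s in servers if s in healthy]
    if not live:
        raise RuntimeError("no healthy servers available")
    if kind == "control":
        # Control panel and API: one specific server, with failover in list order.
        return live[0]
    if kind == "ad":
        # Ad serving: sticky sessions are required.
        return pick_sticky(session_id, live)
    if kind == "file":
        # File serving: any server will do.
        return random.choice(live)
    raise ValueError(f"unknown request kind: {kind}")


servers = ["app1", "app2", "app3"]
all_up = set(servers)
print(route("control", "admin", servers, all_up))
print(route("ad", "visitor-42", servers, all_up))
```

With rendezvous hashing, taking one server offline only moves the sessions that were pinned to that server. Sessions on the remaining servers keep their assignment, which limits the disruption to session-dependent features during a failover.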
The absolute minimum number of servers to run an AdvertPro cluster is two. However, you must consider whether the remaining server will be able to handle your load if one of those servers goes offline. You always want to make sure that you have at least one more server than you actually need. Running a cluster with just two servers would only be a good idea if it's done purely to provide failover rather than load balancing.
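The "one more server than you need" rule is simple arithmetic, sketched below. The function name `servers_needed` is hypothetical, and the per-server capacity of 2 million impressions per hour is an illustrative figure only; measure the capacity of your own hardware before sizing a cluster.

```python
def servers_needed(peak_per_hour: int, per_server_per_hour: int, spare: int = 1) -> int:
    """Servers required to carry the peak load, plus spare servers for failover."""
    if peak_per_hour < 0 or per_server_per_hour <= 0 or spare < 0:
        raise ValueError("load must be >= 0, capacity > 0 and spare >= 0")
    # Integer ceiling division avoids floating point rounding surprises.
    for_load = -(-peak_per_hour // per_server_per_hour)
    return max(1, for_load) + spare


# Illustrative figure: assume one server handles 2 million impressions per hour.
print(servers_needed(8_000_000, 2_000_000))
print(servers_needed(1_500_000, 2_000_000))
```

Under these assumptions, a peak of 8 million impressions per hour needs four servers for the load plus one spare. A load that fits on a single server still yields two servers, which matches the minimum cluster size used purely for failover.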
As you can see, we recommend running Nginx in front of Tomcat. Nginx is able to handle a lot more concurrent connections than Tomcat. It can also cache static content from Tomcat and perform CPU intensive operations such as GZIP compression and SSL encryption, which takes a lot of load off of Tomcat. This will greatly improve overall throughput compared to running Tomcat in standalone mode.
You also want to run MySQL with master/slave replication with one or more slaves. This allows you to distribute reads and it also gives you a hot backup to fail over to if you lose your MySQL master. AdvertPro is, however, actually able to tolerate both the master and slave(s) being taken offline at the same time.
While running multiple slaves is a good way to deal with distributing reads, there is no suitable way to distribute writes and AdvertPro is a write-heavy application. Using RAID10 on your MySQL servers is a must and, if possible, use the fastest SSDs that you can get. TokuDB is also something to consider when your database size becomes a bottleneck. It uses fractal tree indexing, compression and write grouping with many optimizations for SSDs. You can expect to get a 10:1 compression ratio or better, which means simply switching to TokuDB will reduce the amount of data being read from and written to disk by a factor of ten! This does wonderful things for both performance and scalability. Plus, thanks to the power of fractal tree indexes, you can actually eliminate all of the secondary indexes on the statistics tables. This further speeds up writes, reduces disk space usage and has no negative impact on query performance. Another benefit of TokuDB is the elimination of slave lag, so that's yet another reason to give it consideration.
Scaling out an AdvertPro cluster to increase capacity is much like any other web application that's built around a MySQL database.
As you can see, we recommend running Nginx/Tomcat and MySQL on separate groups of servers. They should be connected by a 1 Gbps switch on a private backend network. You might be tempted to try and run MySQL on the same servers as Nginx/Tomcat. For low/medium traffic that may be fine. However, as you add more Nginx/Tomcat servers to your cluster keep in mind that you'll be putting more load on MySQL. Eventually, you will run into a situation where MySQL is CPU/memory starved and at that point it will be a real beast to move onto its own dedicated hardware. Putting MySQL on dedicated hardware from the start will save you a lot of pain and avoid what will likely be a long downtime.
So, ideally, we recommend that you start with a total of four servers: that's two servers for Nginx/Tomcat and two servers for MySQL with master/slave replication and grow from there as needed.
Much of the administration work that needs to be done on an AdvertPro cluster is performed automatically by AdvertPro for you. It does backups, cleans up temporary data, rolls log files, deletes old backups and log files, etc. It even periodically optimizes the MySQL database tables and can repair them on-the-fly if they crash, so you can sleep easier at night. The only thing you really need to worry about is that your servers have adequate CPU/memory and disk space.
At some point, though, your data will certainly outgrow the built-in backup solution. You should consider running an additional MySQL slave just for backups. When you want to do a backup, simply stop the slave and run mysqlhotcopy on it to create a snapshot of your database. This will be much faster since it does a straight file system copy of the table and index files. TokuDB also offers some help here as it can perform a hot backup on a running master or slave without blocking reads/writes. However, that functionality is only available in its enterprise version.
Follow our Nginx Installation instructions for setting up AdvertPro on the first server in your cluster. Nginx isn't absolutely necessary if you won't have high traffic. In that case, you might want to opt for the less complicated Tomcat-only setup by following our Tomcat Installation instructions instead.
Complete steps 1, 2 and 3 from the same installation instructions on each of your additional servers. Once those steps are complete, you should copy all of the files from the /usr/local/tomcat/webapps/ROOT/config directory on the first server to each of the additional servers. Then you can start up Tomcat and Nginx on each of the additional servers as shown in step 4. You won't need to complete step 5 again as that step should only be run on the first server to initialize the MySQL database.
Once all servers are set up and running, the final step is to configure the ad server and file server to use separate domains if that's desired. Log in to the AdvertPro control panel as the admin user. Then go to Settings > Expert > Cluster where you can enter the Ad Serving Domain and File Serving Domain. If AdvertPro is deployed in the ROOT directory you should leave the path fields empty. If AdvertPro is deployed in another directory, paths should start with a slash, e.g. if it's deployed to /usr/local/tomcat/webapps/appname the path should be entered as /appname.
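The rule for the path fields can be expressed as a small helper. This is only a sketch to make the rule concrete: the function name `cluster_path_field` is hypothetical and is not part of AdvertPro, and it assumes the deployment directory sits directly under Tomcat's webapps directory as in the example above.

```python
from pathlib import PurePosixPath


def cluster_path_field(deploy_dir: str) -> str:
    """Return the value to enter in a path field for a given deployment directory."""
    name = PurePosixPath(deploy_dir).name
    # A ROOT deployment uses an empty path; anything else is a slash plus the directory name.
    return "" if name == "ROOT" else "/" + name


print(repr(cluster_path_field("/usr/local/tomcat/webapps/ROOT")))
print(repr(cluster_path_field("/usr/local/tomcat/webapps/appname")))
```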
The control panel will automatically bind to any domain that resolves to the IP address of the server(s) it runs on, so it needs no special configuration with regards to domain names.