Data distancing delivers data remote workers can count on | SC Media


September 29, 2020
Google’s staff won’t return to work until the summer of 2021. Today’s columnist, Aron Brand of CTERA, explains why companies preparing for a long work-from-home period because of the pandemic need to embrace data distancing: the ability to ensure that remote workers can work on their data with the same performance and security they had at the main office. (Photo by Alex Tai/SOPA Images/LightRocket via Getty Images)
  • Minimize latency. Remote work models require enterprise IT teams to deliver fast, interactive data services across greater distances than ever before. If the company has consolidated file storage in a single, centralized datacenter, it's difficult to provide a high-speed user experience because of network latency. Traditional file storage products were not built to handle the latencies and connectivity issues stemming from wider enterprise data topologies. Thus, to ensure a "local" file access experience, IT teams need to locate data files near the users. They can achieve this by manually moving the files, or preferably, by strategically deploying caching devices.
  • Data security. While companies require fast access to data, it shouldn't come at the expense of security. When an organization exposes its global data fabric to remote nodes and endpoints, it creates a security challenge because, by their nature, edge locations do not have strong physical security. It's important for a solution to rigorously control what data users can access at each remote node or endpoint. While VPNs can meet security needs, they are notoriously clunky. Companies need a way to extend corporate file systems to remote users securely without adversely impacting the user experience. Regardless of where users are located, secure file access and the ability to enforce security policies on data movement are crucial.
  • Consistency vs. availability tradeoff. Distributed data fabrics vary in their levels of consistency. Some data fabrics use an "eventual consistency" model and handle inconsistencies (such as two users concurrently editing a file) by creating a conflict file. Other products implement strict global locking, at the cost of availability and latency, as a global locking service often becomes a single point of contention and becomes unreachable during network disconnections. These tradeoffs are rooted in the CAP theorem, which states that a distributed storage system can guarantee at most two of three properties: Consistency (C), Availability (A) and Partition tolerance (P).
  • Data migration. IT teams often find migrating enterprise storage from legacy systems tough to tackle. To ease migration issues, choose a solution that offers strong migration tools for legacy NAS, retains security settings such as Windows ACLs, and maintains backward compatibility with existing filers by exposing the data over the ubiquitous SMB and NFS protocols.
  • Anywhere availability. Employing an enterprise data fabric (aka global file system) that makes data accessible to authorized users from anywhere – at headquarters, branch offices or home – has become a necessity. Files are cached at the edge (either at the endpoint or using regional caching nodes) to ensure low latency access from anywhere across the enterprise. Caching also delivers partition safety, allowing nodes to continue working offline if the company loses connectivity and to resynchronize once connectivity is restored. This resynchronization also ensures business continuity and facilitates global collaboration among remote users.
  • Cloud bursting. This lets organizations access on-premises data quickly and efficiently, while the heavy data crunching occurs in the cloud.
  • Dark data. It's imperative that enterprises corral the dark data on unknown BYOD and work-from-home devices and delete it. Companies should strive to have all of an organization's enterprise data at the fingertips of all employees, everywhere.
  • Security. The data fabric should rigorously control all data accessible at each remote node or endpoint. Security should follow a zero-trust approach, where remote nodes and endpoints can only access a strictly controlled subset of the corporate information with explicit permission, rather than being granted privileged access to the infrastructure.
  • Agility. Organizations should choose products that enable agile collaboration on data among remote workers, users at the organization's headquarters, and remote branches across the world. It's especially important now that working from a remote location has become "the new normal."
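The edge-caching idea behind the latency bullet above can be sketched in a few lines: a remote node serves repeat reads from a local cache and only pays the WAN round trip to the central datacenter on a miss. This is an illustrative simplification, not any vendor's implementation; real caching appliances also handle write-back, invalidation, and prefetching.

```python
from collections import OrderedDict

class EdgeFileCache:
    """Minimal LRU cache for file contents at a remote node (illustrative)."""

    def __init__(self, capacity, fetch):
        self.capacity = capacity      # max number of files cached at the edge
        self.fetch = fetch            # callable that reads from the central datacenter
        self.entries = OrderedDict()  # path -> contents, kept in LRU order

    def read(self, path):
        if path in self.entries:
            self.entries.move_to_end(path)    # cache hit: served locally, low latency
            return self.entries[path]
        data = self.fetch(path)               # cache miss: one WAN round trip
        self.entries[path] = data
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used file
        return data
```

A second read of the same file never leaves the edge node, which is what gives remote users the "local" experience the column describes.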
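The "conflict file" behavior described in the consistency bullet can be sketched as follows: each writer uploads along with the version it last synced, and when that base version is stale the server keeps both copies instead of locking or rejecting, trading strict consistency for availability. The class and conflict-naming scheme here are illustrative assumptions, not a real product's API.

```python
class SyncServer:
    """Illustrative eventually consistent file store using conflict copies."""

    def __init__(self):
        self.files = {}  # name -> (version, contents)

    def upload(self, name, base_version, contents, author):
        current = self.files.get(name)
        if current is None or current[0] == base_version:
            new_version = (current[0] if current else 0) + 1
            self.files[name] = (new_version, contents)
            return name  # clean fast-forward update
        # Concurrent edit detected: preserve the second writer's work as a
        # conflict copy rather than blocking -- availability over strict locking.
        conflict_name = f"{name} (conflict copy, {author})"
        self.files[conflict_name] = (1, contents)
        return conflict_name
```

Two users who both start from version 1 illustrate the tradeoff: the first upload wins, and the second lands in a conflict file for later reconciliation rather than being lost or stalled behind a global lock.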
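The zero-trust model in the security bullets can also be sketched: each remote node gets an explicit allow-list of folders, and anything outside that list is denied by default, so no node ever holds privileged access to the whole fabric. Node names and paths below are hypothetical, and a production system would also authenticate the node and audit each access.

```python
from pathlib import PurePosixPath

class NodePolicy:
    """Illustrative default-deny folder allow-list for remote nodes."""

    def __init__(self):
        self.grants = {}  # node id -> set of folders the node may access

    def grant(self, node, folder):
        self.grants.setdefault(node, set()).add(PurePosixPath(folder))

    def can_access(self, node, path):
        target = PurePosixPath(path)
        # Default deny: a node with no explicit grants sees nothing.
        for folder in self.grants.get(node, ()):
            if folder == target or folder in target.parents:
                return True
        return False
```

Granting a home-office node only its project folder means a request for, say, a payroll file fails without any extra rule, which is the "strictly controlled subset" behavior the column calls for.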