My TechDecisions

Network Security, News

The IT Security Threats Hiding in Unstructured Data

A solution to the IT security threats caused by unstructured data needs to understand several kinds of context, not just linguistic context.

April 7, 2021 Adam Forziati

Unstructured data, meaning data that has no predefined data model, can be difficult for an enterprise to locate and digest. Emails, text files, photos, videos, call transcripts, and business chat apps all accumulate a ton of data, which ends up just floating around in the metaphorical ether. But what’s floating around might also cause IT security and business continuity nightmares for businesses that don’t rein it all in.

Unstructured data currently makes up more than 80% of enterprise data and is growing at a rate of 55-65% per year, according to Apoorv Agarwal, co-founder and CEO at Text IQ, an artificial intelligence platform.

These are the most common IT and business security threats hidden in unstructured data that Agarwal says enterprises are often unaware of until it’s too late:

Personally Identifiable Information & Personal Health Information

Failing to redact all the PII and PHI in files might leave an enterprise in grave danger. Often, PII isn’t as obvious as a name or address.

Special category information like political alignment, religious belief, or sexual orientation might need to be redacted as well.

According to Agarwal, part of the problem is there’s just too much unstructured data — the volume is extremely high, especially with COVID.

Now that people are working remotely, they are communicating more over email, text, and other channels that are now being recorded.

“But the other challenge is to be able to find this in an automated manner. There are probably not enough humans on Earth to actually go through all this data and find this PII and PHI.”
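To make the limits of simple automation concrete, here is a minimal Python sketch of pattern-based redaction. It is not Text IQ's method; the pattern set, the labels, and the sample sentence are all assumptions made up for illustration. It shows how regular expressions catch PII with a predictable shape while missing a name and a reference to religious belief.

```python
import re

# Hypothetical patterns for illustration; real PII and PHI take far more forms.
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def redact(text: str) -> str:
    """Replace each pattern match with a bracketed label such as [EMAIL]."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


# Invented sample message, not taken from any real data set.
sample = (
    "Reach Dana at dana.lee@example.com or 555-867-5309. "
    "She mentioned her church group."
)
print(redact(sample))
# prints: Reach Dana at [EMAIL] or [PHONE]. She mentioned her church group.
```

The email address and phone number are redacted, but the first name and the mention of a church group pass through untouched, which is the kind of less obvious PII and special category information described above.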

Code Words

When committing insider trading or other fraudulent activity, a person is likely to disguise their activity behind code words. Hopefully this doesn’t happen often, or at all, but it raises the question of whether you know what your colleagues really mean when they say “Reuben sandwich.”

Data loss prevention and compliance tools, such as the ones used by financial institutions, use keywords and regular expressions to flag certain kinds of communications, especially those between people who are making an investment and people who are doing the research.

The problem is that the number of false positives is too high, Agarwal says.

“These systems don’t catch all the things they need to be catching. So it’s both like a false positive problem, but they’re also missing things that need to be caught.”
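The two failure modes Agarwal describes can be reproduced with a minimal Python sketch of keyword flagging. The watchlist and both messages are hypothetical, invented for illustration; commercial data loss prevention tools use far larger rule sets.

```python
import re

# Hypothetical watchlist; a real compliance lexicon would be much larger.
WATCHLIST = re.compile(
    r"\b(insider|tip|guaranteed?|non-public)\b", re.IGNORECASE
)


def is_flagged(message: str) -> bool:
    """Return True when the message contains any watchlist term."""
    return WATCHLIST.search(message) is not None


# Invented messages for illustration.
messages = [
    "Remember to tip the delivery driver.",          # harmless, yet flagged
    "Lunch Friday? I'm craving a Reuben sandwich.",  # possible code phrase, missed
]
for message in messages:
    print(is_flagged(message), message)
```

The first message is a false positive because "tip" has an innocent meaning, and the second is a potential false negative because an agreed code phrase contains no watchlist term at all.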

The Same Person Appearing Under Different Names

In files, people can be referred to by their first name, by their initials, by a misspelled version of their name, or even a different name entirely.

“In general, communication has become extremely informal, as opposed to 100 years ago, when employees would be very formal in their writing,” Agarwal says.

“We’re seeing communication become shorter, more frequent, and more informal. We can encourage employees not to misspell, but since people are doing this on the back end, what becomes problematic is identifying certain people or identifying certain things. Machines need to go in and do that normalization, and this is one of the problems we’ve been able to solve using machine learning.”
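As a rough illustration of what that normalization involves, the Python sketch below maps initials, first names, and misspellings to one canonical name. It swaps in plain string similarity from the standard library's difflib in place of the machine learning Agarwal describes, and the directory, the function names, and the 0.8 cutoff are all assumptions.

```python
from difflib import get_close_matches
from typing import Optional

# Hypothetical directory of canonical employee names.
DIRECTORY = ["Priya Raman", "Marcus Feld", "Joanne Whitaker"]


def initials(name: str) -> str:
    """Return the upper-case initials of a full name, e.g. 'MF'."""
    return "".join(part[0] for part in name.split()).upper()


def normalize(mention: str, cutoff: float = 0.8) -> Optional[str]:
    """Map a mention to a canonical name, or None when nothing matches."""
    cleaned = mention.strip()
    # 1. Initials, e.g. "MF" -> "Marcus Feld".
    for name in DIRECTORY:
        if cleaned.upper() == initials(name):
            return name
    # 2. First name, accepted only when it is unique in the directory.
    by_first = [n for n in DIRECTORY if n.split()[0].lower() == cleaned.lower()]
    if len(by_first) == 1:
        return by_first[0]
    # 3. Fuzzy match on the full name to absorb misspellings.
    lowered = {n.lower(): n for n in DIRECTORY}
    close = get_close_matches(cleaned.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[close[0]] if close else None


for mention in ["MF", "priya", "Joane Whitacker", "Bob"]:
    print(mention, "->", normalize(mention))
```

A sketch like this cannot handle the hardest case in the list above, a person referred to by a different name entirely, because nothing in the spelling links the two names. That case needs context from the surrounding communication.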

Sexual Harassment

Inappropriate dialogue or NSFW photos or videos could be lurking in a company’s Slack channel. A lot of this falls on company policy and employee education.

But from an IT perspective, it goes back to data loss prevention tech that has certain keywords and regular expressions to flag communications that may have instances of sexual harassment.

“I think every manager will tell you that there are too many false positives, too many things to look at, and the system is probably not catching all the things that need to be caught. So it’s a pretty hard problem to automatically identify and to be accurate.

“It’s not just about the kind of language people are using; it’s about who’s communicating with who and what their roles are.”

Unconscious Bias

An enterprise might not be aware of a department’s potentially discriminatory hiring or performance review practices until there’s a lawsuit.

By definition, this bias is unconscious. Humans have their own biases, so it’s very hard for us to find unconscious bias. It requires going through a lot of data to start noticing that a manager uses personality traits when they’re reviewing a female, or work product traits when they’re reviewing a male.

“This is a task for which machine learning has to be brought in, and not supervised machine learning,” Agarwal says.

“Supervised machine learning requires humans to come in and label data and thereby the human bias gets injected into the machine. So the right solution here is using unsupervised machine learning methods to find instances of unconscious bias.”
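The review pattern described in this section (personality traits for one group, work product traits for another) can be made visible with a simple count. The Python sketch below is only a descriptive tally, not the unsupervised machine learning Agarwal recommends, and its hand-picked word lists carry their author's own assumptions. The word lists, the reviews, and the group labels are all invented for illustration.

```python
import re
from collections import Counter

# Hypothetical word lists, invented for illustration only.
PERSONALITY = {"abrasive", "helpful", "bubbly", "aggressive", "warm"}
WORK_PRODUCT = {"delivered", "shipped", "analysis", "results", "deadline"}

# Invented review snippets tagged with the reviewee's group ("F" or "M").
reviews = [
    ("F", "Helpful and warm, though sometimes abrasive in meetings."),
    ("F", "Bubbly presence; delivered the report."),
    ("M", "Shipped the release early and delivered strong results."),
    ("M", "Thorough analysis; met every deadline."),
]


def tally(items):
    """Count personality and work-product terms per reviewee group."""
    counts = {}
    for group, text in items:
        words = re.findall(r"[a-z]+", text.lower())
        bucket = counts.setdefault(group, Counter())
        bucket["personality"] += sum(w in PERSONALITY for w in words)
        bucket["work_product"] += sum(w in WORK_PRODUCT for w in words)
    return counts


for group, bucket in tally(reviews).items():
    print(group, dict(bucket))
# prints: F {'personality': 4, 'work_product': 1}
# prints: M {'personality': 0, 'work_product': 5}
```

Even on four invented reviews the skew is visible. On real data, the difficulty is that nobody knows in advance which words belong on each list, which is part of the argument for methods that do not depend on human labeling.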

What to do about unstructured data

So what can IT departments do right now to prevent these problems from plaguing their organizations, and what specifically needs to be done with unstructured data?

Agarwal says IT needs to bring to bear more machine learning and AI that understands context. The solution needs to understand several kinds of context: not just the linguistic context, but also the social context, such as who is communicating with whom and what their roles are.

© 2021 Emerald X, LLC. All rights reserved.