In its launch announcement for Claude for Small Business last week, Anthropic shared a survey finding that stopped me cold. Half of the small-business owners surveyed cited data security as their biggest hesitation about AI.
That number is not a fear of change. It is not anti-AI sentiment. It is small-business owners doing the math correctly. The data inside a 50-person manufacturer, a wealth management firm, a dental DSO, or a mid-sized law office is more concentrated, more sensitive, and harder to recover after a breach than what sits inside a Fortune 500 with redundant systems and a real security team. The risk-adjusted cost of a data incident is, in many cases, higher at an SMB than at a global enterprise.
If you are evaluating AI tools for your business this year, three questions matter more than any feature comparison. I will walk through each one and then show you how the major providers stack up today. The post is vendor-agnostic by design. My goal is not to tell you which tool to pick. My goal is to tell you what to ask.
The Three Questions That Matter
Question 1: Does the vendor train its models on your data by default?
This is the question every announcement tries to answer in marketing copy. The actual answer almost always has the word default doing heavy lifting.
Here is what to listen for. A clean answer sounds like: "We do not use your inputs or outputs for training under any circumstances on this plan, full stop, contractual commitment." A muddy answer sounds like: "We do not train on your data by default." Both can be technically true. They are not equivalent.
The follow-up question: "Under what conditions could that change, and how would I know?" If the only way to confirm you are not in a training state is to trust the marketing page, that is a gap. You want a contractual commitment, a console toggle you can see, and ideally a console log that proves your status historically.
Question 2: Do your existing permissions hold inside the AI?
This one is more subtle and, in my experience working with regulated SMBs, more often violated.
If an employee cannot see customer financials in your QuickBooks today, can the AI act as a bridge that lets them see those financials indirectly by asking the right question? If the AI is given broad system access to make it useful, you have just punched through your access control layer. Whatever your group policies, your folder permissions, your SSO scopes, and your role-based controls were trying to enforce, an AI agent with system-wide read access can quietly route around them.
The right pattern is permission inheritance. The AI sees what the user sees. The AI cannot return data the user could not have queried themselves. If the AI is configured with its own service account that has broader access than any individual user, you are taking on real risk, and you should know it.
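To make the pattern concrete, here is a minimal Python sketch of permission inheritance. It is an illustration under stated assumptions, not any vendor's implementation: the Document and User types, the group names, and the sample records are all hypothetical. The idea is simply that the assistant's context is built only from documents the asking user could already open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    allowed_groups: frozenset  # hypothetical: groups permitted to read this document
    text: str


@dataclass(frozen=True)
class User:
    user_id: str
    groups: frozenset  # hypothetical: groups the user belongs to (e.g. from SSO)


def visible_documents(user, documents):
    """Return only the documents the user could have opened directly."""
    return [d for d in documents if user.groups & d.allowed_groups]


def build_ai_context(user, documents):
    """Assemble the text the assistant may read on behalf of `user`."""
    return "\n".join(d.text for d in visible_documents(user, documents))


# Hypothetical sample data for illustration only.
DOCS = [
    Document("handbook", frozenset({"staff", "finance"}), "PTO policy: 15 days."),
    Document("customer-financials", frozenset({"finance"}), "Acme Corp owes $48,200."),
]
clerk = User("clerk01", frozenset({"staff"}))
controller = User("ctrl01", frozenset({"finance"}))

print(build_ai_context(clerk, DOCS))       # handbook text only
print(build_ai_context(controller, DOCS))  # handbook and customer financials
```

The risky alternative is the same code with the filter removed, which is effectively what a broad service account does: every user's question is answered from everything the service account can read.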
Question 3: Is there an audit trail you can defend?
Every regulated industry I serve, including financial services, healthcare, and defense contractors under CMMC, asks the same question: Can you prove what happened?
If a customer or a regulator asks you, "Did your AI see my data, and what did it do with it?", you need an answer with a timestamp, a user ID, the prompt content, the data accessed, the action taken, and the human who approved it. Anything less is not auditable. Anything less will not survive an OCR review, a state insurance commissioner inquiry, or a Reg SCI exam.
Audit logging is the boring part of AI procurement and the part regulators care about most. Ask for samples of the audit log format before you commit. Ask how long logs are retained. Ask how you export them.
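When a vendor sends a sample log, it helps to know what a complete entry would look like. The Python sketch below models the six fields named above and checks an entry for gaps. The field names, the AuditRecord type, and the sample values are assumptions made for illustration; real vendor log formats will differ.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import List, Optional

# Hypothetical field names covering the six items a defensible record needs.
REQUIRED_FIELDS = (
    "timestamp", "user_id", "prompt",
    "data_accessed", "action_taken", "approved_by",
)


@dataclass
class AuditRecord:
    timestamp: str             # when the request happened (ISO 8601, UTC)
    user_id: str               # who asked
    prompt: str                # what they asked
    data_accessed: List[str]   # which records or files the AI read
    action_taken: str          # what the AI did
    approved_by: Optional[str] # the human who approved the action


def to_json_line(record: AuditRecord) -> str:
    """Serialize one record as a JSON line suitable for export."""
    return json.dumps(asdict(record), sort_keys=True)


def missing_fields(entry: dict) -> list:
    """List required fields that are absent or empty in a log entry."""
    return [f for f in REQUIRED_FIELDS if entry.get(f) is None or entry.get(f) == ""]


# Hypothetical sample record for illustration only.
sample = AuditRecord(
    timestamp=datetime(2025, 1, 15, 14, 30, tzinfo=timezone.utc).isoformat(),
    user_id="clerk01",
    prompt="Summarize open invoices for Acme Corp",
    data_accessed=["invoices/acme-2025-01.pdf"],
    action_taken="drafted_email",
    approved_by="ctrl01",
)

print(to_json_line(sample))
print(missing_fields({"timestamp": sample.timestamp, "user_id": "clerk01", "prompt": "hi"}))
```

Running missing_fields against a vendor's sample entry, after mapping their field names to these, gives a quick list of what the log cannot prove.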
How the Major Providers Stack Up Today
A quick walk through the major options as of today. None of this is a recommendation. It is meant to help you ask sharper questions.
Microsoft Copilot
Microsoft’s commercial Copilot offerings (Copilot for Microsoft 365 and Copilot Chat with commercial data protection) do not train on tenant data and inherit Microsoft 365 permissions natively. Audit logging flows into Microsoft Purview, which is well understood by enterprise security teams. The sub-processor list is published. BAAs are available for Microsoft 365 with appropriate licensing.
Strength: deepest enterprise security tooling, easiest audit story if you are already a Microsoft shop.
Watch-out: consumer Copilot at copilot.microsoft.com without a signed-in commercial account is a different product and a different data policy. Make sure your team is using the commercial version, not the consumer one.
Google Gemini
Gemini for Workspace inherits Drive and Workspace permissions. Google has stated that paid Workspace customer data is not used to train Gemini models. Audit logging is available through the admin console. BAAs are available with appropriate Workspace tiers.
Strength: native Workspace integration, simple if you are a Google shop.
Watch-out: the consumer Gemini app at gemini.google.com is a different product. Same caution as Copilot.
OpenAI ChatGPT
ChatGPT Business and Enterprise plans do not train on customer data by default, per OpenAI’s data usage policies. Enterprise adds SSO, advanced admin, and longer audit retention. SOC 2 Type II is available. The consumer ChatGPT plans have different data handling defaults.
Strength: most familiar tool for most employees, low training friction.
Watch-out: connectors to internal systems are newer and the audit logging granularity inside those connectors is still maturing.
Anthropic Claude Team, Enterprise, and Claude for Small Business
Anthropic states that data on Team and Enterprise plans is not used for training by default. The new Claude for Small Business package includes permission inheritance through Claude Cowork and human-in-the-loop approval as default behavior. SOC 2 Type II is available.
Strength: trust posture is explicit in the launch language and the human-in-the-loop pattern is the default rather than an option. The "approval before anything sends, posts, or pays" framing is the clearest articulation of safe agent behavior I have seen from any major vendor.
Watch-out: HIPAA BAA posture at the SMB tier specifically is unstated as of the launch. Audit log depth on agentic actions is also unstated. Both are expected to be clarified, but neither is in the launch materials.
The "By Default" Trap
The phrase to watch on any vendor’s site is some variant of: "We do not train on your data by default."
The word default is doing real work. It means there is a non-default state where they do train on your data. The question is: what triggers that state, can you confirm you are not in it, and is there an audit log proving your status today and historically?
Some defaults flip when an admin checks a box they might not realize is consequential. Some flip when a user accepts a terms update. Some flip when you start a free trial of a different product from the same vendor and the trial inherits your main account data. Ask.
A One-Page Evaluation Checklist
Use this against any AI vendor, not just the four above.
Training opt-out: is there one, is it the default, is it documented contractually, and is there a console log proving status?
Permission inheritance: does the AI inherit my SSO group and folder permissions, or does it use a service account with broader access?
Audit logging: how granular (per prompt, per action, per data object), how long retained, how do I export?
Data residency: where does my data physically sit, and can I require US-only?
Sub-processors: who else touches my data, and how am I notified of changes?
BAA availability: if I am a HIPAA covered entity or business associate, will the vendor sign a BAA for the plan I am buying?
SOC 2 Type II and ISO 27001: are the attestations current, and can I get a copy?
Termination data return and deletion: what is the SLA, and is it contractual?
If a vendor cannot answer all eight clearly and in writing, you have your answer.
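If you are comparing several vendors, the checklist is easy to track in a few lines of Python. This is a minimal sketch, assuming you record for each item whether the answer was clear and whether it was in writing; the item keys and the sample vendor answers are hypothetical.

```python
# Hypothetical short keys for the eight checklist items above.
CHECKLIST = (
    "training_opt_out",
    "permission_inheritance",
    "audit_logging",
    "data_residency",
    "sub_processors",
    "baa_availability",
    "attestations",
    "termination_data_deletion",
)


def open_items(answers):
    """Return checklist items the vendor has not answered clearly and in writing.

    `answers` maps an item key to a (clear, in_writing) pair of booleans.
    Items missing from `answers` count as unanswered.
    """
    gaps = []
    for item in CHECKLIST:
        clear, in_writing = answers.get(item, (False, False))
        if not (clear and in_writing):
            gaps.append(item)
    return gaps


# Hypothetical vendor: everything answered except two items.
vendor_answers = {item: (True, True) for item in CHECKLIST}
vendor_answers["data_residency"] = (True, False)  # verbal assurance only
del vendor_answers["baa_availability"]            # never answered

print(open_items(vendor_answers))
```

An empty list means the vendor cleared all eight. Anything else is the follow-up list for your next call with them.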
Closing Thoughts
Anthropic is doing more than most on this front. The launch language about not training on your data by default, permission inheritance, and human-in-the-loop approvals is closer to what an enterprise security team would write than to what most consumer AI products say. That is good for the whole industry. It pulls the floor up.
But your job as an owner is not to grade vendors on a curve. Your job is to make sure the answers fit your specific regulators, your specific insurance posture, and your specific data classification. The eight questions above are how you do that. Have more questions on what to look for? Let's chat.
Disclosure: Techvera is an MSP serving small and medium businesses across North Texas, Oklahoma, and New York. Our internal operations are powered in part by Anthropic’s Claude. Nothing in this post is a recommendation of any specific product or vendor. Evaluate AI tools against your own regulatory, security, and operational posture.
About the Author
Todd Mitchell
Chief Operating Officer
Todd Mitchell is the COO of Techvera, bringing operational expertise and strategic vision to help businesses transform their IT infrastructure.
